Abstract

Despite  a  technology  bias  that  focuses  on  external  electronic  threats,  insiders 
pose  the  greatest  threat  to  commercial  and  government  organizations.  One  means  of 
preventing  insider  theft  is  by  stopping  potential  insiders  from  actually  crossing  the 
line. In the overwhelming majority of cases, people do not join an organization with
the intention of stealing or causing harm. Instead, something, or often several
things, happens while the individual is in the organization that precedes his malevolent
actions. One of the traits identified with insiders is their feeling of alienation from the
organization. By datamining emails, an employee’s interests can be discerned. These
interests are then used to construct social networks, which are used to identify
individuals whose interests are shared externally but not discussed with other members
of the organization.
These individuals with clandestine interests have the potential to be insider threats.
This paper describes the use of Probabilistic Latent Semantic Indexing (PLSI) [?]
extended to include users (PLSI-U) and Author Topic [?] extended to include
documents to determine topics of interest for employees from their email activity. It
then  applies  PLSI-U  and  Author  Topic  to  the  Enron  email  corpus.  The  results  show 
that  by  comparing  the  topics  of  emails  that  people  send  internally  with  the  ones  sent 
externally,  a  small  number  of  employees  (0.03%  -  1.0%)  emerge  as  having  clandestine 
interests  and  the  potential  to  become  insider  threats.  Most  significantly,  one  of  these 
individuals is Sherron Watkins, the famous whistleblower in the Enron case.

Detecting Potential Insider Threats
through Email Datamining


I.  Introduction 

1.1  The  Insider  Threat 

“Espionage  is  the  practice  of  spying  or  using  spies  to  obtain  information  about 
the  plans  and  activities  of  a  foreign  government  or  a  competing  company”  [?].  While 
it  is  possible  to  insert  professional  spies  into  an  organization,  today,  the  use  of  insiders 
is  much  more  prevalent.  Insiders  are  members  of  an  organization  or  government  who 
often  have  a  legitimate  right  to  the  information  that  they  are  accessing.  However, 
they  abrogate  the  trust  they  have  been  given  by  using  the  information  for  illegitimate 
reasons.  Since  most  Cold  War  espionage  was  perpetrated  by  the  Soviet  Union  [?],  it 
was  expected  that  after  the  end  of  the  Cold  War,  the  spy  threat  would  decrease.  This 
has  not  been  the  case.  In  addition  to  the  increased  number  of  security  threats  from 
terrorist  organizations,  there  has  been  an  alarming  increase  in  the  number  of  cases  of 
economic  espionage.  There  is  a  significant  amount  of  evidence  [?]  that  some  nations 
are  funding  insider  espionage  by  some  of  the  corporations  based  in  their  countries. 

When considering who becomes an insider, the different motives of these economic
and security insider threats make it important to consider them separately.
Insiders  who  become  economic  threats  are  more  often  motivated  solely  by  financial 
considerations  while  those  who  are  security  threats  may  tend  to  have  more  divided 
loyalties.  Take  for  instance  Ana  Belen  Montes,  a  senior  intelligence  analyst  at  the 
Defense  Intelligence  Agency,  who  was  arrested  on  September  21,  2001  for  spying  for 
Cuba. In her statement at her sentencing she stated, “I obeyed my conscience rather
than the law. I believe our government’s policy towards Cuba is cruel and unfair,
profoundly un-neighborly, and I felt morally obligated to help the island defend itself.”
She never asked for money and acted as she did only out of ideology [?].

Shaw, et al. [?] describe eight different types of insiders (explorers, good Samaritans,
hackers, Machiavellians, exceptions, avengers, career thieves, and moles). The
two least common (career thieves and moles) are the only ones who enter a
corporation with the intention of stealing from it. In all other cases, improving the
pre-screening  process  is  ineffective.  When  these  individuals  join  an  organization  they 
have  no  intention  of  spying  or  stealing  from  it.  What  is  needed  is  some  way  to  prevent 
these  individuals  from  becoming  insider  threats.  The  first  two  cases  (explorers  and 
good Samaritans) are innocent individuals who simply stray into areas of computer
systems where they are not supposed to be. Having adequate safeguards can effectively
prevent  these  two  types  of  insiders.  In  the  other  four  cases  (hackers,  Machiavellians, 
exceptions,  and  avengers),  something  happens  which  turns  them.  In  many  cases,  this 
event  is  some  corporate  change  which  results  in  them  being  disgruntled.  Such  events 
include  restructuring,  streamlining,  being  passed  over  for  promotion  or  simply  a  bad 
review.  In  other  cases,  this  event  may  have  nothing  to  do  with  the  organization.  It 
may  be  a  personal  crisis  such  as  the  end  of  a  relationship,  the  ill  health  or  death  of 
a  spouse  or  child,  or  a  severe  financial  problem.  Faced  with  these  events,  individuals 
often  withdraw  from  the  organization  and  may  seek  relief  in  alcohol  or  drugs.  As  they 
withdraw  they  feel  alienated  from  their  organization  making  it  easier  to  overcome  any 
inhibitions  they  have  about  betrayal. 

1.2  Preventing  the  Insider  Threat 

Since in most cases detection of these warning signs would allow for early
intervention and the prevention of any harm such as sabotage and theft, developing
methods of detection is vital. A recent report by the U.S. Department of Defense [?]
observed that “Nothing can replace first rate management of subordinates, genuine
concern for their well being, fairness, and recognition of personal warning signs for
mitigating the insider threat”. Unfortunately, while this sentiment is valid, it is stated
at the same time that the DoD and every other public and private organization are
attempting to become more efficient by cutting mid-level management and converting
most of the management that remains into working managers. At the same time
there  is  a  huge  increase  in  both  internal  turnover  and  external  contractors  working 
in  sensitive  positions.  It  is  not  possible  for  managers  to  effectively  get  to  know  all 
of  the  individuals  under  their  direct  supervision  to  the  point  that  behavioral  changes 
will  be  noticed.  Instead,  they  must  pick  a  few  individuals  and  focus  on  getting  to 
know  them.  What  is  needed  is  an  effective  way  for  them  to  pick  which  individuals 
to  focus  on.  The  DoD  report  [?]  acknowledges  this  and  goes  on  to  recommend  the 
“maximum  use  of  datamining  to  enable  continual  online  review  of  personnel  security 
information.” 

In today’s Information Age, one of the best sources of personal information
available at work is an individual’s email and internet activity. By datamining an
organization’s email, it is possible to learn a lot not only about the organization,
but also about the individuals within it. Datamining can find potential insiders by
finding  individuals  who  feel  alienated  from  the  organization  and/or  who  have  interests 
contrary  to  the  organization’s  well-being.  While  the  goal  of  this  datamining  is  to 
detect  individuals  who  wish  harm  to  the  organization,  the  privacy  concerns  of  innocent 
individuals  must  be  considered.  It  is  possible  once  personal  information  is  gathered 
for  this  information  to  be  used  in  ways  that  will  cause  innocent  individuals  harm. 
While  this  possibility  exists,  one  must  recall  that  most  organizations  explicitly  inform 
employees  that  email  and  internet  use  is  to  be  restricted  to  only  business  purposes.  If 
this  restriction  is  followed,  there  is  little,  if  any,  personal  information  that  can  later  be 
used to harm individuals, significantly decreasing any privacy concerns.

Probabilistic clustering is a method for dividing information into groups of similar
objects by assuming these groups come from a mixture of several populations [?]. The
goal in probabilistic clustering is to find parameters for these probability distributions
that best fit the data.

Two promising probabilistic clustering algorithms are Probabilistic Latent
Semantic Indexing (PLSI) [?] and Author Topic [?]. Each model assumes that
documents (or emails) are constructed one word at a time. Before picking each word,
a topic is selected and then the word is selected based on the topic. Author Topic
also allows for a similar mechanism that involves the author of the email, allowing
some words and topics to be more likely depending on the interests of the author. By
applying these models to email (and expanding PLSI to include authors (PLSI-U)),
it is possible to extract topics of interest for the organization and its members. In
addition, probabilistic clustering can associate individual employees with those
topics that they have the greatest interest in. These topics of interest can then be used
to develop social networks. There are several ways to use these social networks to
find insiders. First, if there are specific topics of concern, individuals with an interest
in these topics can be flagged. Second, if individuals share common interests with
known insiders, they can be flagged. Finally, if individuals fit an “insider threat
profile” based on their topics of interest, they can be flagged. Such a profile may include
feelings of alienation or interests in alcohol, drugs, or financial solutions.
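
The word-at-a-time generative assumption described above can be illustrated with a minimal sketch. The probability tables, the author name, and the function names below are hypothetical values invented for illustration; they are not taken from the Enron corpus or from the PLSI and Author Topic papers, and the sketch covers only the single-author case.

```python
import random

# Hypothetical toy parameters, invented for illustration only.
# P(topic | author): how likely an author is to write about each topic.
TOPIC_GIVEN_AUTHOR = {
    "john": {"accounting": 0.7, "basketball": 0.3},
}
# P(word | topic): how likely each word is under each topic.
WORD_GIVEN_TOPIC = {
    "accounting": {"ledger": 0.5, "audit": 0.3, "budget": 0.2},
    "basketball": {"game": 0.6, "score": 0.4},
}


def sample(dist, rng):
    """Draw one key from a {key: probability} dictionary."""
    keys = list(dist)
    return rng.choices(keys, weights=[dist[k] for k in keys], k=1)[0]


def generate_email(author, length, rng):
    """Build an email one word at a time: pick a topic, then a word."""
    words = []
    for _ in range(length):
        topic = sample(TOPIC_GIVEN_AUTHOR[author], rng)
        words.append(sample(WORD_GIVEN_TOPIC[topic], rng))
    return words


rng = random.Random(0)
email = generate_email("john", 8, rng)
print(email)
```

Clustering runs this process in reverse: given only the observed words and authors, it estimates tables like the two above.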

1.3  Using  Enron  as  the  Data  Source 

Ideally,  probabilistic  clustering  and  social  networking  could  be  tested  against 
a  real  world  dataset  of  both  email  traffic  and  internet  activity.  However,  privacy 
concerns  have  prevented  this.  Most  research  involving  email  traffic  uses  the  personal 
email  of  the  researchers  (and  possibly  a  small  set  of  other  individuals).  It  is  very 
difficult  to  get  large  organizations  to  release  the  private  email  of  their  employees 
for  research.  The  one  large  corpus  of  real-world  email  traffic  that  is  available  is  a 
subset  of  email  from  Enron.  As  part  of  their  investigation  into  Enron,  the  Federal 
Energy  Regulatory  Commission  (FERC)  seized  Enron’s  email  and  made  a  portion 
of it publicly available. While it only includes the email folders of 151 employees,
it  still  contains  over  250,000  email  messages.  Furthermore,  due  to  the  number  of 
individuals  the  emails  were  sent  to,  the  resulting  corpus  has  sufficient  data  on  over 
34,000  Enron  employees  to  make  probabilistic  clustering  of  these  individuals  possible. 

While using the Enron email corpus has some limitations (most notably a
bias toward the 151 employees whose email these folders came from and a lack of any
internet activity data), it does have some benefits. Unlike most organizations, Enron
has at least one known insider, Sherron Watkins. Prior to the public disclosure of
Enron’s questionable accounting, Ms. Watkins, a vice president in Enron’s corporate
accounting division, sent a letter to Ken Lay, Enron’s chairman, detailing the dubious
accounting practices and their likely impact on Enron’s future. While she did
not go outside the corporation, her activities were considered an insider threat by
her boss, Andy Fastow, Enron’s Chief Financial Officer, who, upon finding out that
Watkins had written the letter, demanded that she be fired immediately and her laptop
confiscated [?].

Although  many  would  argue  that  Watkins’  actions  should  characterize  her  as 
a whistleblower rather than an insider, the distinction is not clear. At its simplest,
what  distinguishes  a  whistleblower  is  that  he  (or  she)  is  revealing  information  about 
an  illegal,  or  at  least  unethical,  practice  and  therefore  is  serving  the  public  good. 
However,  certainly  from  the  organization’s  point  of  view,  the  individual  can  still  be 
considered  an  insider  threat.  Furthermore,  if  the  person’s  motives  are  selfish,  for 
instance  revenge  for  a  real  or  imagined  slight,  the  distinction  becomes  even  fuzzier. 
In  the  final  analysis,  whistleblowers  still  function  in  the  same  manner  as  insiders.  It 
is  just  that  they  are  revealing  information  that  arguably  should  be  revealed  and  that 
they  may  be  doing  it  for  noble  reasons. 

1.4  The  Experiment 

The  goal  of  this  research  is  to  test  the  hypothesis  that  “probabilistic  clustering 
and  social  networking  techniques  applied  to  email  are  effective  at  detecting  potential 
insider threats”. Consider Figures 1.1 and 1.2. First, probabilistic clustering finds the
topics  for  each  email  (in  this  case  baking  and  significant  others).  Next,  by  looking 
at  all  of  an  individual’s  email,  his  topics  of  interest  are  determined  (in  this  case, 
cooking,  significant  others,  accounting,  basketball,  sports,  and  partying)  and  a  social 
network is created. Third, his internal emails are considered to see what topics he
communicates about within the organization (in this case accounting, basketball, and
sports) and a second social network is created.


Figure  1.1:  Example:  An  Email  John  received 


The  two  are  compared  to  determine  what,  if  any,  clandestine  interests  he  has  (in 
this  case  cooking,  significant  others,  and  partying).  These  are  interests  he  has  that  he 
does not share with anyone within the organization. Finally, this list of clandestine
interests is limited to only sensitive topics (in this case partying). An interest in
partying  that  he  doesn’t  share  with  people  within  the  organization  may  indicate  that 
he  feels  alienated,  a  warning  sign  of  a  potential  insider  threat.  In  other  cases  sensitive 
topics  would  be  about  things  the  organization  does  not  want  revealed.  For  instance, 
in  the  case  of  Enron,  the  sensitive  topic  is  about  the  dubious  accounting  practices. 
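
The comparison step in the John example amounts to a set difference followed by a filter. The following is a minimal sketch of that logic only; the function name is hypothetical, and the topic sets are the ones from the example rather than output from either clustering model.

```python
def clandestine_interests(all_topics, internal_topics, sensitive_topics):
    """Return (hidden, flagged) topic sets for one individual.

    hidden  = topics seen in all of the person's email but not in internal email
    flagged = hidden topics that are also on the sensitive list
    """
    hidden = set(all_topics) - set(internal_topics)
    flagged = hidden & set(sensitive_topics)
    return hidden, flagged


# Topic sets taken from the John example in the text.
all_topics = {"cooking", "significant others", "accounting",
              "basketball", "sports", "partying"}
internal_topics = {"accounting", "basketball", "sports"}
sensitive_topics = {"partying"}

hidden, flagged = clandestine_interests(all_topics, internal_topics, sensitive_topics)
print(sorted(hidden))   # ['cooking', 'partying', 'significant others']
print(sorted(flagged))  # ['partying']
```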


1.5  Analysis 

After the social networks are created, an analysis of the resulting data is performed.
Three criteria determine whether Author Topic and PLSI-U are appropriate
for extracting insider threats: they must be timely, useable and valid. Since
some topics may only emerge on a sporadic basis, this technique should be performed
on at least one to three months of email. If as a result it is only run every one to
three months, results that are produced in seven to ten days can be considered
timely. Next, to be useable it must be possible for an administrator to examine the
topics produced and quickly be able to identify the underlying topic by examining the
most probable words. To measure this, the 25 most probable words and individuals
for each topic are checked by the researcher to see if they produce easily identifiable
topics.

Figure 1.2: Example: Networks that include John (panels: implicit interest networks
and explicit interest networks, each covering Socializing, Basketball, and Accounting)

Finally,  the  validity  of  the  probabilistic  clustering  for  insider  threat  detection 
must  be  established.  For  probabilistic  clustering  to  be  considered  a  success,  Sherron 
Watkins  must  be  identified  as  one  of  the  individuals  with  a  clandestine  interest  in  the 
topic  most  strongly  related  to  Enron’s  questionable  accounting  practices.  This  would 
indicate  that  she  had  the  potential  to  become  an  insider  threat. 

In  addition  to  analyzing  the  effectiveness  of  PLSI-U  and  Author  Topic,  the 
resulting  social  network  data  also  allows  for  the  calculation  of  the  most  central,  or 
important, individuals. Traditional social network analysis (SNA) has metrics to
determine the centrality of individuals and the cohesiveness of the overall networks [?].
By making such calculations, the most probable individuals from Author Topic and
PLSI-U can be compared to the most central individuals from SNA. Finding the same
individuals appearing in both cases reinforces the conclusion that Author Topic and PLSI-U
are clustering individuals with the appropriate topics.
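
As one illustration of an SNA centrality metric, the sketch below computes normalized degree centrality (a node's number of distinct neighbors divided by n - 1). This section does not say which centrality measures the thesis uses, so the choice of degree centrality is an assumption, and the names and edges are made up.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Normalized degree centrality for an undirected network.

    edges: iterable of (person_a, person_b) pairs; self-loops are ignored.
    Returns {person: degree / (n - 1)} where n is the number of people seen.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}


# Hypothetical email links between four people.
edges = [("ann", "bob"), ("ann", "cat"), ("ann", "dan"), ("bob", "cat")]
scores = degree_centrality(edges)
most_central = max(scores, key=scores.get)
print(most_central, scores)
```

The individuals ranked highest here could then be compared with the most probable individuals that the clustering models report for each topic.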

1.6  Conclusion 

Chapter 2 begins with a more detailed discussion of identifying threats through
likely insider personality traits and motivations. It then reviews the basics of
probabilistic clustering and gives an in-depth review of Author Topic and Probabilistic
Latent Semantic Indexing. The discussion then covers social networks, a technique
used to effectively expose individuals alienated from their organization. Chapter 3
reviews the methodology, detailing the research questions and the evaluation metrics,
followed by a description of the data including its preparation, clustering and
analysis. It then concludes with two supplementary experiments performed in the hopes of
developing additional information. Chapter 4 then discusses the results of the
experiments. Both PLSI-U and Author Topic create good clusters of topics appropriate to
Enron  and  associate  appropriate  individuals  with  these  topics.  However,  only  Author 
Topic  is  successful  at  extracting  Sherron  Watkins  as  a  potential  insider.  Chapter  5 
concludes  with  a  summary  of  results  as  well  as  some  possible  next  steps. 


II.  Background 

There  are  estimates  that  as  many  as  87%  of  all  electronic  thefts  come  from  individuals 
with  legitimate  access  to  the  organization  [?].  However,  in  the  vast  majority  of  cases, 
these individuals did not join the organization with the intention to steal [?].
Something changed along the way and they “converted”. This is not a new phenomenon.
It  has  existed  since  ancient  times  [?]  and,  despite  the  introduction  of  new  high-tech 
methods  for  spying,  insiders  continue  to  have  the  ability  to  steal  information  not  easily 
accessible  by  any  other  means. 

Unfortunately,  despite  the  prevalence  of  insider  theft  and  its  cost,  there  has 
been  only  a  small  amount  of  analysis  of  it.  Most  of  that  analysis  has  focused  on  the 
character traits of insiders. These insiders generally feel alienated from the
organization, possess an ethical flexibility combined with a lack of empathy, and have a
strong  sense  of  entitlement  [?].  When  this  lack  of  social  inhibition  is  combined  with 
an  opportunity  and  a  motive,  the  conditions  are  ripe  for  a  theft  to  occur.  However, 
one  last  ingredient  is  generally  required  [?].  There  is  usually  a  trigger  that  sets  these 
individuals  off.  Triggers  include  relationship  crises,  financial  crises  and  even  health 
crises. At such times, behaviors change and individuals begin to contemplate insider
theft. 

What  is  required  is  for  managers  to  notice  these  times  of  crisis.  Unfortunately, 
as  companies  continue  to  downsize,  it  is  reasonable  to  expect  that  fewer  managers 
will  be  managing  more  individuals.  As  a  result,  there  may  not  be  enough  time  for 
managers  to  sufficiently  get  to  know  and  continually  observe  their  people.  Instead, 
what  is  needed  is  an  automated  way  of  accomplishing  this  [?]. 

By  examining  the  email  of  an  organization,  it  is  possible  to  perform  datamining 
and  find  individuals  who  have  the  potential  to  become  insider  threats.  Individuals 
who  feel  alienated  or  who  have  clandestine  interests  in  sensitive  topics  can  be  found 
by  extracting  their  topics  of  interest.  One  means  of  finding  individuals’  topics  of 
interest  is  probabilistic  clustering.  Probabilistic  clustering  divides  data  into  similar 
groups by assuming that each datum (in an email, this is a word) comes from a mixture
of several populations whose probability distribution parameters are unknown. The
goal  of  probabilistic  clustering  is  to  find  values  for  these  parameters  that  make  the 
probability  distributions  fit  the  data.  Two  popular  probabilistic  clustering  techniques 
are  Probabilistic  Latent  Semantic  Indexing  (PLSI)  and  Latent  Dirichlet  Allocation 
(LDA).  While  PLSI  has  not  been  directly  applied  to  the  construction  of  documents 
involving  multiple  individuals,  LDA  has  been  applied  in  the  form  of  the  Author  Topic 
(AT)  model. 
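
For reference, the mixture assumption and the standard PLSI aspect model can be written as follows. The notation is the conventional one from the topic-modeling literature and is an assumption here, since this section defines no symbols; the PLSI-U extension is not shown because its exact form is not given in this part of the text.

```latex
% Generic mixture: a datum x comes from one of K populations
% with unknown mixing weights \pi_k and parameters \theta_k.
p(x) = \sum_{k=1}^{K} \pi_k \, p(x \mid \theta_k),
\qquad \sum_{k=1}^{K} \pi_k = 1

% PLSI aspect model: word w in document d, with latent topic z.
P(w \mid d) = \sum_{z} P(w \mid z) \, P(z \mid d)

% Fitting: choose P(w | z) and P(z | d) to maximize the log-likelihood
% of the observed counts n(d, w), up to terms that do not depend on them.
\mathcal{L} = \sum_{d} \sum_{w} n(d, w) \, \log P(w \mid d)
```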

Once  these  techniques  have  clustered  the  data  into  discrete  topics,  individuals’ 
interests  in  specific  topics  are  used  to  construct  social  networks.  These  social  networks 
are  then  used  to  match  people  who  have  some  of  the  tendencies  of  past  insiders.  Also, 
if  an  insider  is  known,  these  networks  are  used  to  find  other  individuals  with  similar 
interests  who  may  also  be  potential  insiders. 
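
One simple way to turn topic interests into networks is sketched below. Linking every pair of people whose probability for a topic exceeds a cutoff is an illustrative assumption, as are the 0.2 threshold, the function names, and the toy probabilities; the thesis's actual construction may differ.

```python
from itertools import combinations


def interest_members(user_topic_prob, threshold=0.2):
    """Map each topic to the set of users whose interest meets the threshold."""
    members = {}
    for user, topics in user_topic_prob.items():
        for topic, prob in topics.items():
            if prob >= threshold:
                members.setdefault(topic, set()).add(user)
    return members


def interest_edges(members):
    """For each topic, link every pair of users who share that interest."""
    return {topic: list(combinations(sorted(users), 2))
            for topic, users in members.items()}


def similar_to(known_insider, members):
    """Users who share at least one topic network with a known insider."""
    shared = set()
    for users in members.values():
        if known_insider in users:
            shared |= users
    shared.discard(known_insider)
    return shared


# Hypothetical P(topic | user) values, invented for illustration.
user_topic_prob = {
    "ann": {"accounting": 0.6, "sports": 0.4},
    "bob": {"accounting": 0.5, "cooking": 0.5},
    "cat": {"sports": 0.9, "accounting": 0.1},
}
members = interest_members(user_topic_prob)
edges = interest_edges(members)
print(edges["accounting"])              # [('ann', 'bob')]
print(sorted(similar_to("bob", members)))  # ['ann']
```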

Finding data for this type of research is difficult. Privacy concerns render most
real-world datasets unavailable and artificial datasets are generally too small. One
recent dataset that is fast becoming the touchstone of current information retrieval (IR)
research is the Enron email corpus. When the Federal Energy Regulatory Commission
(FERC) began investigating Enron, it seized a significant amount of email. This
email  was  later  released  to  the  public  in  the  form  of  electronic  files.  This  initial  email 
corpus  has  since  been  sanitized,  removing  private  information,  and  cleaned  of  data 
integrity  issues.  The  result  contains  approximately  250,000  messages  involving  over 
87,000  individuals  making  it  a  wonderful  dataset  for  performing  email  datamining 
research. 

2.1  Defining  Insiders 

Sun  Tzu  observed  that  what  enables  the  wise  sovereign  and  good  general  to 
strike  and  conquer  is  foreknowledge  and  this  foreknowledge  is  only  obtained  from 
others  [?],  specifically  spies.  There  are  five  types  of  spies:  spies  inserted  into  the 
enemy’s  local  populace,  spies  inserted  into  the  enemy’s  government,  spies  who  leak 
false information to the enemy, spies who bring back news from the enemy, and
converted spies. Converted spies are double agents, spies for the enemy who have
been  converted  to  spy  on  their  former  employers.  Sun  Tzu  felt  that  of  all  of  the  types 
of  spies,  converted  spies  were  the  most  important  because  they  allowed  the  discovery 
and  use  of  the  other  four  types. 

Since Sun Tzu’s time, more methods have been developed for acquiring
information from an enemy or competitor. These methods include signals intelligence
(SIGINT), imagery intelligence (IMINT), measurement and signature intelligence
(MASINT), human-source intelligence (HUMINT), open source intelligence
(OSINT) and geospatial intelligence [?]. While technical means have become more
popular recently, there are still many types of information that only HUMINT can
obtain, including detailed descriptions of underground facilities or facilities under a
jungle canopy. HUMINT is also the only source currently available to provide
qualitative descriptions of people. Despite the glamour of spies like James Bond, the reality is
that  most  human  information  obtained  comes  from  insiders.  In  2000,  the  Department 
of  Defense  (DoD)  produced  a  report  showing  that  87%  of  identified  intruders  were 
either  employees  or  other  internals  to  the  organizations  [?].  Furthermore,  in  the  last 
fifty  years,  most  insiders  in  the  United  States  (79%)  have  sought  out  foreign  agencies 
to  provide  information  to  [?].  Traditionally,  it  has  been  more  difficult  for  a  foreign 
service  to  seek  out  one  of  the  few  cleared  Americans  willing  to  betray  their  country 
and  instead,  foreign  agencies  sit  back  and  wait  for  Americans  to  come  to  them. 

To summarize, an insider is someone who “steals, spies, or betrays their loyalty
to their country or employer” [?]. Observe that, in addition to theft, this definition
also includes an individual who sabotages, extorts or causes some other harm to the
organization. In addition to being called insiders, these individuals are also called insider
threats and malicious insiders.

2.1.1 Becoming an Insider. In the overwhelming majority of cases,
individuals do not start in an organization with the intent to do harm. Instead, something
happens over time that results in an individual becoming an insider threat. There
are four preconditions that are generally met before a person decides to become an
insider threat [?]. First, the individual must have a motive. Second, he must have
an opportunity to commit the crime. Third, he must overcome natural inhibitions,
e.g. moral values. Finally, a trigger is needed to set the crime in motion. Herbig and
Wiskoff [?] in their study of espionage cases found that the number one motivating
factor (69%) was money. However, among individuals who sought out
foreign governments to sell information to, disgruntlement with the workplace (26%)
was also a significant motive. One factor that has significantly increased since the end
of the Cold War is divided loyalties, with 50% of the spies caught since 1990 citing
divided loyalty (34% overall). While money was the number one factor, there are
several contributing causes behind it. In half of the cases, indebtedness was the reason
that money was the top factor. One of the more intuitive reasons, coercion, appears
to not have been a significant factor, present in only 5 of the 150 cases.

After  having  a  motive  to  commit  the  crime,  the  next  thing  necessary  is  the 
opportunity.  There  are  two  elements  to  opportunity.  The  first  is  the  opportunity  to 
steal  the  information  and  the  second  is  the  opportunity  to  sell  it  [?].  As  mentioned 
earlier,  in  most  cases,  potential  spies  had  to  manufacture  the  opportunity  to  sell  their 
information  and  in  many  cases  were  caught  as  a  result.  Unlike  finding  a  buyer,  which 
many  people  put  little  forethought  into,  most  people  do  plan  how  to  steal  information. 
While  it  is  difficult  for  organizations  to  prevent  people  from  finding  buyers,  it  is  within 
organizations’  power  to  make  obtaining  information  to  sell  more  difficult.  As  a  result, 
there  have  been  many  studies  (especially  since  the  dramatic  increase  in  economic 
espionage  and  electronic  theft)  into  what  factors  are  most  important  in  protecting 
information.  Shaw,  et  al.  [?]  found  that  vulnerabilities  in  poor  management  practices, 
poor  employee  screening,  incomplete  termination  procedures,  missing  warning  signs, 
and  not  monitoring  online  communications  contributed  most  to  thefts.  In  contrast,  a 
DoD  study  [?]  found,  in  most  cases,  laziness  and  ignorance  of  security  policies  were  the 
causes. ASIS International, the largest professional organization for security managers,
found theft reductions dependent on limiting information access, making information
and physical security a priority, and ensuring management concern and focus on
information  loss  [?].  Similarly,  CSO  (Chief  Security  Officer)  Magazine  found  the 
best  way  to  deter  thefts  was  to  train  new  employees  and  have  management  regularly 
communicate  that  security  was  a  priority  [?]. 

Even though many people have motives to cause harm or theft and the
opportunity to do it, most do not. In addition to motives and opportunities, in order
to  steal  people  must  overcome  their  natural  inhibitions  to  commit  an  immoral  act 
against  their  coworkers  and  country.  One  motivation  that  counteracts  this  is  divided 
loyalties.  If  a  person  feels  that  whichever  action  they  take,  they  are  letting  down  one 
of  their  countries  or  that  they  must  balance  country  against  family,  inhibitions  are 
much  decreased.  A  second  factor  in  lowering  inhibitions  is  substance  abuse.  In  the 
Herbig  and  Wiskoff  database  of  150  spies  [?],  alcohol  abuse  was  present  in  51%  of 
cases  for  which  data  was  available  and  drug  abuse  was  present  in  53%  of  cases  for 
which  data  was  available.  What  these  two  factors  also  have  in  common  is  that  they 
lead individuals to feel alienated from their organization. It is much easier to not feel
inhibited from harming an organization if one no longer feels a part of it.

Finally,  although  substance  abuse  and  alienation  can  reduce  inhibitions,  they 
still  exist  and  still  deter  most  individuals.  However,  when  motivation  and  opportu¬ 
nity  and  alienation  are  combined  with  a  life  crisis,  people  stop  thinking.  Herbig  and 
Wiskoff  [?]  report  that  one-fourth  of  the  spies  experienced  a  life  crisis  in  the  months 
preceding  their  decision  to  become  a  spy  (e.g.  divorce,  death  of  a  loved  one,  love 
affair gone bad). Dr. Mike Gelles of the Naval Criminal Investigative Service (NCIS)
also  argues  that  even  narcissists  and  antisocial  individuals  require  aggravation  by  a 
personal,  financial,  or  career  crisis  which  friends,  coworkers  and  supervisors  fail  to  rec¬ 
ognize  as  a  serious  problem  [?].  As  a  complement  to  these  two  traits,  Project  Slammer 
(another  study  of  Americans  convicted  of  espionage  against  the  United  States)  found 
that  there  is  also  a  different  type  of  individual  who  commits  espionage,  one  who  is 
passive,  easily  influenced,  and  is  lacking  in  self-esteem.  Although  most  spies  fall  into 
the first category, there are several that fall into the second [?].

2.1.2 Insider Traits. Disgruntlement is becoming a much more common reason
that individuals become insider threats. However, there are very few individuals
who have never gotten angry at their supervisors or thought their organization, or
government, was making bad choices. If this were all it took to become an insider
threat, the bad guys would outnumber the good guys. Therefore, it makes sense to
look at some of the character traits that increase the chances of someone who is
disgruntled becoming an insider threat. Shaw, et al. [?] describe six personality factors
that  commonly  appear  in  insiders:  a  history  of  personal  and  social  frustration,  com¬ 
puter  dependency,  ethical  flexibility,  reduced  loyalty,  a  sense  of  entitlement  and  a  lack 
of  empathy. 

Dr.  Mike  Gelles  condenses  these  characteristics  down  to  two:  narcissism  and 
antisocial  personality  disorder  [?].  Antisocial  people  reject  the  normal  rules  and  stan¬ 
dards  of  society  and  have  no  remorse  over  their  actions.  They  do  not  form  attachments 
to  either  people  or  causes.  They  are  therefore  often  manipulative,  self-serving  and 
place high emphasis on immediate gratification. Narcissists also have a strong feeling
of entitlement and a lack of empathy for others. When narcissists are criticized (i.e.
their  high  self-image  is  threatened),  they  may  react  viciously,  seeking  revenge  out  of 
proportion  with  the  criticism.  They  may  also  seek  out  other  groups  that  will  restore 
their  unrealistic  self-image.  The  basic  difference  between  narcissists  and  the  antiso¬ 
cial  personality  is  that  the  antisocial  personality  rejects  the  rules  while  the  narcissist 
believes  the  rules  apply  to  everyone  else  but  not  to  him  or  her. 

2.1.3  Types  of  Insiders.  Shaw,  et  al.  [?]  consider  eight  types  of  insider 
threats:  explorers,  good  Samaritans,  hackers,  Machiavellians,  exceptions,  avengers, 
career  thieves,  and  moles.  Explorers  are  people  who  are  just  curious  and  are  looking 
at  places  they  perhaps  shouldn’t  be.  They  mean  no  harm  but  have  the  potential  to 
shut a business down by deleting a file that they shouldn’t have. Unlike explorers,
Good Samaritans have a purpose and wish to fix something that is broken although
it is not in their area. Unfortunately, Good Samaritans, like explorers, often make the
same mistakes and create havoc by “fixing” something that wasn’t broken. Hackers
are  the  worst  of  the  innocents.  They  also  have  no  wish  to  harm  their  company;  they 
just  want  to  prove  they  can  get  where  they  know  they  aren’t  supposed  to  be,  often  to 
impress  their  friends  and  receive  peer  approval.  The  greatest  problem  with  hackers 
is  that  they  are  often  one-upped  to  the  point  where  they  end  up  doing  things  that 
are  often  destructive  and  at  least  annoying  in  order  to  prove  that  they  are  the  best. 
Machiavellians are the first group to fall into the malicious category. Their motivation
is the advancement of their careers and personal goals, and they see nothing wrong with
acquiring unauthorized information in order to do so. Exceptions take this one step
further  by  believing  that  they  are  exceptions  to  the  rules  and  that  the  rules  don’t 
apply  to  them.  Often  this  belief  has  been  reinforced  by  their  supervisors  who  have 
previously  given  in  to  demands  because  they  were  too  “important”  to  offend  (e.g. 
offending them may result in the computer system not being available). One subclass
of exceptions is the proprietors, who believe that the computer system is “theirs”
and that they are free to do whatever they think is best. Avengers also feel entitled, but
unlike  Machiavellians  who  are  trying  to  advance  themselves,  avengers  are  trying  to 
get  revenge  for  some  ill  done  to  them.  As  such,  their  goal  often  is  to  cause  damage  to 
the  organization.  Finally,  the  last  two  categories,  career  thieves  and  moles,  enter  into 
an  organization  with  the  explicit  goal  of  theft.  While  these  categories  are  the  least 
common,  they  are  the  most  dangerous  since  every  action  taken  from  the  beginning 
is  with  the  long-term  goal  of  stealing  from  the  organization.  It  is  at  least  a  little 
comforting  to  note  that  in  the  overwhelming  number  of  the  cases  reported  in  [?], 
the  traitors  did  not  fall  into  either  of  the  last  two  categories,  i.e.  they  did  not  enter 
government  service  with  the  intent  to  commit  espionage. 

2.1.4  Whistleblowers.  Observe  that  whistleblowers  are  not  mentioned  above 
as  one  of  the  eight  types  of  insider  threats.  A  whistleblower  is  an  individual  who  is 
a  current  or  former  member  of  an  organization  and  who  acts  with  the  intention  of 
making information public, either internally or externally, about possible or actual
non-trivial wrongdoing in an organization [?,?]. Whistleblowers can be distinguished
from  other  insiders  by  the  fact  that  they  are  revealing  information  about  an  activity 
that  violates  public  laws  or  public  trust.  However,  while  there  is  no  literature  that 
specifies  whether  a  whistleblower  should  be  considered  an  insider  threat,  some  authors 
consider  whistleblowers  to  be  insiders  [?].  Recall  that  an  insider  is  someone  who 
betrays  their  loyalty  to  their  organization.  It  can  be  argued  that  a  whistleblower  does 
just  that.  Therefore  it  is  difficult  to  avoid  seeing  a  similarity  between  the  two. 

One possible reason that many people distinguish between whistleblowers and
insiders is that they equate the insider threat with the term malicious insider.
Using this definition for insider threat (i.e. malicious insider), it is easier to determine if
a whistleblower should be considered an insider threat. In many cases, whistleblowers
come forward for revenge. For instance, in one of the Australian Competition
and Consumer Commission’s (ACCC) cases, the whistleblower was an employee who
discovered his wife was having an affair with his boss. He then revealed to the ACCC that
his boss was entering into unlawful arrangements with competitors [?]. In such a
case, it is easy to classify the whistleblower as an insider threat. However, in other
cases, it is not so easy. Consider the seven Hanford pipefitters who were laid off for
refusing to install a faulty valve in a system carrying high-level nuclear waste [?] or
Russell Tice, who was dismissed for raising suspicions that his colleague might be a
spy for China [?]. These individuals risked a lot with no clear indication
of personal gain. These cases point out that if a whistleblower’s motives are pure,
then he probably is faced with a conflict between his loyalty to his organization and
his loyalty to the public at large. In this case, it is easy to argue that he is not a
malicious insider. However, it is still unclear whether he should be considered an
insider threat, defined simply as one who betrays their loyalty to their organization.
While  many  would  argue  that  whistleblowers  should  be  considered  different,  for  this 
research, whistleblowers are considered insider threats.

2.1.5 Preventing Insider Threats. What is critical to observe is that of the
eight types of insiders, only the two least common enter an organization with the
intention of becoming insider threats. For the rest, something changes along the way.
What  is  needed  then  is  to  be  vigilant,  both  to  behavioral  changes  and  to  the  initial 
tentative  fumblings  of  betrayal  [?]. 

The  Defense  Personnel  Security  Research  Center  (PERSEREC)  and  Project 
Slammer  both  found  that  once  convicted,  many  of  these  spies  complained  that  if 
someone  had  stood  in  their  way,  asked  them  what  was  going  on,  shown  an  interest  in 
them  before  they  started  or  been  paying  attention  after  they  started,  the  espionage 
would  have  stopped.  One  of  the  reasons  they  began  to  commit  espionage  is  because 
they  felt  no  one  cared  about  them  and  no  one  would  notice  them  committing  espionage 
anyway  [?].  In  one  case,  a  spy  had  taken  classified  documents  into  another  room  to 
photograph  them.  While  he  was  photographing  them,  he  was  interrupted  twice  by 
people  entering  the  room.  They  saw  what  he  was  doing  but  just  excused  themselves  for 
barging  in  and  left,  apparently  assuming  he  had  a  legitimate  reason  for  his  activities 
and  that  someone  else  knew  about  it.  As  the  DoD  Integrated  Process  Team  on  Insider 
Threat  observes,  “Nothing  can  replace  first  rate  management  of  subordinates,  genuine 
concern  for  their  well  being,  fairness,  and  recognition  of  personal  warning  signs  for 
mitigating  the  insider  threat...  Managers  must  live  up  to  the  expectation  that  they 
evaluate  personnel  effectiveness  daily,  develop  the  skills  to  recognize  individuals  who 
require  special  assistance  and  provide  avenues  for  them  to  acquire  that  assistance.”  [?] 

One place that managers can look for guidance in determining what constitutes
warning signs is PERSEREC. The “Adjudicative Guidelines for Determining Eligibility for
Access to Classified Information” is broken down into thirteen guidelines addressing
potential areas of concern. They are: alcohol consumption, allegiance to the
United States, criminal conduct, drug involvement, emotional, mental and personality
disorders, financial considerations, foreign influence, foreign preference, misuse of
information technology systems, outside activities, personal conduct, security violations,
and sexual behavior [?]. Unfortunately, in most cases, people do not have issues
in these areas when they first go for a security clearance. It is only later that these
issues surface, making the initial vetting process moot and highlighting the need for an
ongoing  automated  process  that  would  bring  to  light  any  developing  concerns  in  these 
areas. The DoD Insider Threat Report [?] spelled this out with its recommendation
2.7, which advises: “employ maximum use of datamining to enable continual online
review  of  personnel  security  information.”  One  method  of  performing  ongoing  checks 
that  requires  relatively  little  effort  is  to  do  an  annual  financial  review  and  credit  check. 

2.2  Detecting  Potential  Insiders  by  Datamining 

During  the  first  workshop  following  the  Department  of  Defense’s  report  [?],  the 
first priority for improving the detection of insider misuse was “developing [user]
profiling as a technique” [?]. To develop these profiles, the workshop participants
proposed using: files and processes normally accessed, periods of time that a user
is  logged  in,  and  keystroke  patterns.  By  comparing  old  profiles  with  current  ones, 
anomalies  (e.g.  use  of  administrator  or  logging  commands)  are  better  detected  [?]. 
While  this  is  successful  if  there  is  historical  data  to  compare  to,  the  amount  of  history 
that  is  needed  is  overwhelming. 

An  alternative  means  to  detect  anomalies  is  to  consider  an  individual’s  interests. 
There  are  several  ways  to  detect  potential  insiders  based  on  their  interests.  First,  if 
an individual has a set of interests that match a known “insider profile”, that person
warrants  additional  attention.  However,  since  no  such  “insider  profiles”  exist,  other 
methods  must  be  considered.  Second,  if  an  individual’s  interests  change  radically,  it 
may  be  indicative  of  a  personal  crisis  that  again  suggests  the  need  for  more  personal 
attention.  A  third  method  uses  these  interests  to  detect  if  an  individual  feels  alienated 
from  the  organization.  If  a  person  has  a  set  of  interests  but  does  not  share  some  or 
all  of  them  with  anyone  within  the  organization,  this  may  indicate  the  person  feels 
alienated.  Furthermore,  if  these  interests  include  sensitive  subjects  (e.g.  terrorism, 
confidential  information,  financial  issues),  this  strongly  suggests  to  the  researcher  that 
more individual attention is needed.

The first step in developing these interest profiles is to separate the email
activity into topics or clusters. Clustering is the division of data into groups of similar
objects  [?].  The  goal  of  this  clustering  is  the  discovery  of  hidden  patterns.  Once 
the  data  has  been  clustered,  datamining  is  used  to  extract  desired  information.  For 
instance,  by  clustering  people  who  die  of  heart  attacks,  one  can  identify  groups  such 
as  smokers  and  people  in  high  stress  jobs.  There  are  several  categories  of  cluster¬ 
ing  techniques.  The  first  is  hierarchical  clustering  which  attempts  to  build  a  tree 
from  the  data.  This  is  a  very  general  technique  and  works  well  for  a  wide  variety  of 
data  types  and  granularity.  A  second  category  of  clustering  is  partitioning  relocation. 
While  hierarchical  clustering  sees  all  of  the  data  as  related  at  some  level,  partitioning 
relocation  does  not  make  that  assumption.  It  relocates  the  data  into  distinct  clus¬ 
ters  based  on  some  criterion.  Unlike  hierarchical  clustering,  partitioning  relocation 
looks  at  the  data  multiple  times  possibly  relocating  each  datum  multiple  times  as 
it  explores  more  of  the  data  set.  Probabilistic  clustering  is  a  form  of  partitioning 
relocation  that  assumes  that  the  data  comes  from  a  mixture  of  several  populations 
whose  probability  distribution  parameters  and  priors  it  seeks  to  find.  Density-based 
partitioning  is  a  simple  method  of  clustering  that  requires  the  data  points  to  exist 
in  a  Euclidean  space  and  then  uses  distance  and  density  to  determine  the  clusters. 
While both density-based partitioning and hierarchical clustering are simple to implement
and work well, they both suffer from the curse of dimensionality [?], which says
that as the number of attributes increases (20 is often considered a reasonable cutoff),
the sparsity of data results in the algorithms becoming unstable. While probabilistic
clustering  also  suffers  to  some  extent  from  the  curse  of  dimensionality,  the  effect  is 
much  less. 

2.3 Clustering Email

The  first  step  in  determining  how  to  cluster  email  is  to  decide  how  to  represent 
it. There are several different possible information models to consider.

2.3.1 The Vector Model.

Assumption 1: Email can be treated as a “bag-of-words”, i.e. reducing
a document down to the words that make it up, without
regard to the ordering of those words, still keeps the essential
characteristics of the email.

At first glance this assumption appears completely unrealistic. Obviously the
order of the words matters. For example, the sentence “the police massacred the
protestors” is very different from “the protestors massacred the police”. However,
empirically this overly restrictive assumption still performs very well [?].

Definition  1:  The  collection  of  email  is  called  a  corpus. 

Definition  2:  A  document  (d)  represents  an  email  message.  There 
are  M  documents  in  the  corpus. 

Definition 3: Each document is made up of words, $(w_1..w_N)$. For
each document, the number of words, $N$, is considered to be drawn from a
Poisson distribution.

Definition 4: Since there are a finite number of documents and
each document is composed of a finite number of words, there are a
finite number of words, $V$, contained in the corpus. The collection
of all words in the corpus is the vocabulary where $w_i$ represents the
$i$th word in the vocabulary.

Definition 5: A document is represented as a vector, with $V$
entries. Entry $e_i$ is the number of times $w_i$ occurs in the document.

With  this  representation,  it  is  possible  to  define  the  similarity  between  two 
documents.  Since  each  document  is  a  vector,  it  is  possible  to  calculate  the  angle 
between  two  vectors,  i.e.  documents.  The  closer  the  angle  is  to  90  degrees,  the  closer 
the  cosine  of  the  angle  is  to  0.0  and  the  less  similar  the  documents  are.  Conversely, 
the closer the angle is to 0 degrees, the closer the cosine of the angle is to 1.0 and
the more similar the documents are. Therefore, by calculating the cosine of the angle
between two vectors, similarity is measured as a real number between 0.0 and 1.0.
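The similarity measure described above can be illustrated with a minimal sketch. The function names, the toy vocabulary, and the example sentences are assumptions chosen for illustration; they are not part of the original method description.

```python
import math
from collections import Counter


def bag_of_words(text, vocabulary):
    """Return the count vector of `text` over a fixed vocabulary (word order is discarded)."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


def cosine_similarity(u, v):
    """Cosine of the angle between two count vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical five-word vocabulary and three toy "emails".
vocabulary = ["the", "police", "massacred", "protestors", "budget"]
doc_a = bag_of_words("the police massacred the protestors", vocabulary)
doc_b = bag_of_words("the protestors massacred the police", vocabulary)
doc_c = bag_of_words("budget budget budget", vocabulary)

print(cosine_similarity(doc_a, doc_b))  # same bag of words, so the angle is 0 degrees
print(cosine_similarity(doc_a, doc_c))  # no shared words, so the angle is 90 degrees
```

Because the two example sentences contain the same words, they map to the same vector, which is exactly the loss of word order that Assumption 1 accepts.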

While it is theoretically possible to construct a vector containing every word,
considering that there are hundreds of thousands of words (especially considering
the multiplicity of languages), reducing the dimensionality is helpful.

2.3.2  Including  Hidden  Topics  in  the  Vector  Model. 

Assumption 2: Emails are created around underlying topic(s).

Assumption 3: The probability of a specific word appearing in an
email is conditional on the topic of that email.

Assumption 2 coupled with Assumption 3 ensures that each email has a specific
topic and that it is possible to determine what that topic is based on which words
appear. Using these assumptions, it is now possible to consider a collection of $K$
topics, $(z_1..z_K)$.

Definition  6:  A  topic  is  a  latent  structure.  Certain  words  occur 
in  a  document  because  the  document  contains  a  specific  topic. 
Depending  on  the  model,  documents  may  contain  one  or  more 
topics. Although there is no way to explicitly count the number of
topics  in  the  corpus,  some  number  of  topics,  K ,  must  be  assumed 
to  exist  a  priori. 

Definition 7: A document is also represented as a vector with $K$
entries. Entry $e_i$ is 1 if topic $z_i$ occurs in the document. It is 0
otherwise.

The  similarity  of  documents  is  computed  in  an  identical  manner  to  the  method 
used  for  vectors  of  words.  However,  while  the  number  of  words  is  at  least  in  the 
hundreds  of  thousands,  the  number  of  topics  is  orders  of  magnitude  smaller. 

One  of  the  weaknesses  in  all  of  the  models  considered  in  this  paper  is  in  deciding 
a priori what K is. While it is very simple (if time consuming) to look at every
document to determine the number of words in the vocabulary, there is no objective
way to determine the number of topics. However, if too few topics are assumed,
different topics are lumped together; whereas, if too many topics are assumed, emails
that should be of the same topic are split up.

2.3.3  Generative  Models.  Now  that  models  of  representing  emails  have 
been  discussed,  the  next  step  is  to  consider  how  an  email  is  constructed.  These 
generative  models  are  then  supersets  that  include  the  vector  models  already  discussed. 
To  proceed  on  firm  theoretical  footing,  it  is  necessary  to  consider  the  underlying  means 
by  which  a  new  email  is  constructed.  Once  this  method  is  known,  it  is  possible  to  use 
its  characteristics  to  develop  a  statistical  model.  This  statistical  model  is  then  used 
to  predict  the  likelihood  that  a  specific  email  is  constructed  from  a  specific  topic,  and 
consequently  is  a  member  of  a  particular  topic. 


Figure 2.1: Unigram Model

2.3.3.1 Unigram Model. The simplest model to consider is the unigram
model shown in Figure 2.1. This model posits that each word has an underlying
probability of occurring in any email. When a new document is constructed, $N$ words
are chosen for the email based on an underlying multinomial probability distribution.
Therefore, the probability of a particular email occurring is $\prod_{i=1}^{N} p(w_i)$. This model
assumes that there is no underlying topic and therefore it is equally likely that an email
is composed of, for example, the words “baseball, puff pastry, and alcohol abuse”.

2.3.3.2 Mixture of Unigrams.

Assumption 4: Naive Bayes Assumption for words: the existence
of a specific word in an email is conditionally independent of the
presence of every other word in that email given the topic.

The remaining models all consider the notion of underlying topics, or categories.
Notationally, consider $z_i$ to denote a specific topic. The simplest model that includes
topics (Figure ??) is one that proposes that an email is constructed by first selecting
a topic from a multinomial distribution and then choosing words from a multinomial
distribution based on that topic, i.e. choosing words conditioned on the choice of topic.
For this model, the probability of a given document is $\sum_{i=1}^{K} p(z_i) \prod_{j=1}^{N} p(w_j|z_i)$. While
this model is much more expressive than the unigram model, it does still restrict the
selection process to a single topic.
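To make the difference between the unigram model and the mixture of unigrams concrete, the following sketch evaluates both document probabilities on a toy example. All probability tables, topic names, and words are hypothetical values chosen for illustration; the word marginals under the mixture are deliberately set equal to the unigram probabilities.

```python
import math


def unigram_probability(words, p_w):
    """Probability of a document under the unigram model: product of p(w_i)."""
    return math.prod(p_w[w] for w in words)


def mixture_of_unigrams_probability(words, p_z, p_w_given_z):
    """Probability under the mixture of unigrams: sum over topics of p(z) * product of p(w_j | z)."""
    total = 0.0
    for topic, prior in p_z.items():
        total += prior * math.prod(p_w_given_z[topic][w] for w in words)
    return total


# Hypothetical parameters (assumptions for illustration only).
p_w = {"baseball": 0.25, "inning": 0.25, "pastry": 0.25, "butter": 0.25}
p_z = {"sports": 0.5, "cooking": 0.5}
p_w_given_z = {
    "sports": {"baseball": 0.45, "inning": 0.45, "pastry": 0.05, "butter": 0.05},
    "cooking": {"baseball": 0.05, "inning": 0.05, "pastry": 0.45, "butter": 0.45},
}

coherent = ["baseball", "inning"]  # both words fit one topic
mixed = ["baseball", "pastry"]     # words drawn from different topics

print(unigram_probability(coherent, p_w), unigram_probability(mixed, p_w))
print(mixture_of_unigrams_probability(coherent, p_z, p_w_given_z),
      mixture_of_unigrams_probability(mixed, p_z, p_w_given_z))
```

Under these assumed values the unigram model cannot tell the two documents apart, while the mixture of unigrams assigns a noticeably higher probability to the document whose words share a topic.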


2.3.3.3 Mixture of Words and Topics.

Assumption 5: Naive Bayes Assumption for topics: the existence
of a specific topic in an email is independent of the presence of every
other topic in that email.

To  provide  additional  flexibility,  the  mixture  of  unigrams  model  is  expanded  so 
that  prior  to  each  word  being  added,  a  topic  is  selected  from  a  multinomial  distribu¬ 
tion  and  then  the  word  is  selected  conditionally  given  the  topic  from  a  multinomial 
distribution (Figure ??). The two generative models that fall into this category are
Hofmann’s Probabilistic Latent Semantic Indexing (PLSI) [?] and Blei, Ng, and Jordan’s
Latent Dirichlet Allocation (LDA) [?].


PLSI assumes the creation of documents just as described above. What is most
desired is the joint probability of a word $w_i$ occurring in document $d_j$ which contains
topic $z_k$. However, given the size of the vocabulary, the number of words in the documents
and the number of topics, determining this full joint probability is unrealistic.
However, it is sufficient to determine the probability of topic $z_k$ for a specific document.
Then by looking at the probabilities for all of the topics, one determines which
topics the document contains (since they will have the greatest probabilities). Therefore,
the goal is to determine $p(z|d)$. However, given the generative model, there is no
direct relationship between topics and documents. A topic “produces” words and the
collection of words creates the document. Therefore, in order to determine $p(z|d)$, it
is first necessary to consider $p(z|d,w)$, i.e. the probability that topic $z$ occurs given
document $d$ and word $w$. By Bayes’ rule,


$$p(z|d,w) = \frac{p(z)\,p(d|z)\,p(w|z)}{\sum_{z' \in Z} p(z')\,p(d|z')\,p(w|z')} \tag{2.1}$$


In order to come up with the most likely probabilities, PLSI uses Expectation Maximization.
It attempts to determine the expected $p(z|d,w)$ by maximizing the values
for $p(z)$, $p(w|z)$, and $p(d|z)$ based on the documents in the corpus (i.e. by using the
probabilities that seem most appropriate for the documents in the corpus). Formally,


(2.2) 


(2.3) 


(2.4) 

(2.5) 


where  n(d,w)  is  the  number  of  times  word  w  occurs  in  document  d. 

Expectation  Maximization  (EM)  is  a  technique  for  extracting  probability  distri¬ 
butions  that  involve  latent  (i.e.  unobserved)  variables.  For  PLSI,  at  no  time  are  the 
topics  observed.  EM  assumes  that  each  topic  has  its  own  probability  distribution  (e.g. 
the probability distribution $p(w|z)$ is potentially different for each value of $z$). The
corpus, then, is a mixture of these different probabilities. For simplicity, EM explicitly
assumes that each of these distributions is Gaussian. Although recent work suggests
this is an inappropriate assumption [?] for modeling documents, the empirical results
show that EM is effective. EM works by pretending that it knows the parameters of
each of these distributions. It then infers the probability that each of the observed
data points was drawn from a specific distribution. At each step, EM increases the
log likelihood (and consequently the likelihood) function. After sufficient iterations,
EM is guaranteed to reach a local maximum in likelihood. However, since it is not
guaranteed  to  reach  a  global  maximum,  in  practice  it  is  often  run  several  times  from 
different  starting  points  [?]. 
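The EM procedure for PLSI can be sketched in a few lines. The E-step below follows Equation 2.1. The M-step uses the re-estimation formulas as they are usually stated for PLSI (each of $p(z)$, $p(d|z)$, and $p(w|z)$ is set proportional to the sum of $n(d,w)\,p(z|d,w)$ over the remaining indices); this is an assumption and may not match the paper's own update equations term for term. The sketch works directly with discrete probability tables, and the function names, the toy count matrix, and the iteration count are illustrative choices.

```python
import math
import random


def normalize(values):
    """Scale non-negative values to sum to 1 (uniform fallback if they are all zero)."""
    total = sum(values)
    if total == 0.0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def log_likelihood(counts, p_z, p_d_z, p_w_z):
    """Sum over (d, w) of n(d, w) * log sum_z p(z) p(d|z) p(w|z)."""
    ll = 0.0
    for d, row in enumerate(counts):
        for w, n_dw in enumerate(row):
            if n_dw:
                p_dw = sum(p_z[k] * p_d_z[k][d] * p_w_z[k][w] for k in range(len(p_z)))
                ll += n_dw * math.log(p_dw)
    return ll


def plsi_em(counts, num_topics, iterations=20, seed=0):
    """Fit p(z), p(d|z), p(w|z) to a document-by-word count matrix with EM."""
    rng = random.Random(seed)
    num_docs, num_words = len(counts), len(counts[0])
    # Random starting point; EM only finds a local maximum, so the seed matters.
    p_z = normalize([rng.random() + 0.1 for _ in range(num_topics)])
    p_d_z = [normalize([rng.random() + 0.1 for _ in range(num_docs)]) for _ in range(num_topics)]
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(num_words)]) for _ in range(num_topics)]
    history = []
    for _ in range(iterations):
        acc_z = [0.0] * num_topics
        acc_d = [[0.0] * num_docs for _ in range(num_topics)]
        acc_w = [[0.0] * num_words for _ in range(num_topics)]
        for d in range(num_docs):
            for w in range(num_words):
                n_dw = counts[d][w]
                if n_dw == 0:
                    continue
                # E-step (Equation 2.1): posterior over topics for this (d, w) pair.
                posterior = normalize(
                    [p_z[k] * p_d_z[k][d] * p_w_z[k][w] for k in range(num_topics)]
                )
                for k in range(num_topics):
                    weight = n_dw * posterior[k]
                    acc_z[k] += weight
                    acc_d[k][d] += weight
                    acc_w[k][w] += weight
        # M-step: re-estimate each distribution from the expected counts.
        p_z = normalize(acc_z)
        p_d_z = [normalize(row) for row in acc_d]
        p_w_z = [normalize(row) for row in acc_w]
        history.append(log_likelihood(counts, p_z, p_d_z, p_w_z))
    return p_z, p_d_z, p_w_z, history


# Toy corpus (hypothetical): 4 documents over a 4-word vocabulary, two rough word groups.
counts = [[4, 3, 0, 0], [5, 2, 0, 0], [0, 0, 3, 4], [0, 1, 2, 5]]
p_z, p_d_z, p_w_z, history = plsi_em(counts, num_topics=2, iterations=20, seed=1)
print(history[0], history[-1])  # log likelihood after the first and last iteration
```

Running the function from several seeds and keeping the run with the highest final log likelihood corresponds to the restart strategy mentioned above.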

Unfortunately, PLSI is limited by explicitly using the corpus to develop its
model. Although the values it comes up with are reasonable for the corpus, there is
no reason to think that they are reasonable for some new, not yet seen, document.
In other words, while it is good at predicting, it is bad at generating. As a generative
model, PLSI implicitly assumes that associated with each topic is a multinomial
distribution, specifically the probability of a word being selected given that topic. PLSI
then uses the document to determine what those probabilities are. LDA assumes that
there are an infinite number of multinomials that could be associated with picking a
topic. The multinomial that actually emerges from the corpus is just one possibility.
LDA explicitly “picks” one collection of these topic probabilities by including a
parameter, $\alpha$, selected a priori for a Dirichlet distribution (for information on why the
Dirichlet distribution makes sense, refer to [?,?]). Then $\theta$ is drawn from this
distribution, producing a multinomial probability for topics. This mechanism prevents LDA
from being constricted by the corpus. However, if LDA were modified so that $\alpha$ was 1,
i.e. the probability of picking any topic was uniform, then LDA would become PLSI. Said
differently, PLSI is just a special case of LDA [?]. In addition to $\theta$, the probability
distribution for topics, the conditional probability of picking words given a topic is
also required. LDA assumes this conditional probability exists a priori and labels it $\beta$. The
goal, then, for LDA is to determine $p(z, \theta|d, \alpha, \beta)$. This is seen graphically in Figure
2.4.


Figure  2.4:  Latent  Dirichlet  Allocation  Model 


Based on this model, in order to create a document, $N$ words are selected.
To pick a word, a topic is first selected. Since the probability of picking a
topic is conditioned on $\theta$, the probability of picking a topic is $p(z|\theta)$. After the topic
is selected, the probability of picking a word is conditional on the topic, $z$, and $\beta$.
Therefore, the probability of selecting a word is $p(w|z, \beta)$. However, since a word can
come from any topic, $z_1..z_K$, the probability of each word is actually the probability of
that word for topic 1 plus the probability of that word for topic 2 and so on. Similarly
the probability of a topic must also be considered for each possible $\theta$. Therefore, the
generative equation for a document is:


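$$p(d|\alpha,\beta) = \int p(\theta|\alpha) \left( \prod_{i=1}^{N} \sum_{j=1}^{K} p(z_j|\theta)\,p(w_i|z_j,\beta) \right) d\theta \tag{2.6}$$

and a generative model for the corpus of:

$$p(d_1..d_M|\alpha,\beta) = \prod_{d=1}^{M} \int p(\theta_d|\alpha) \left( \prod_{i=1}^{N_d} \sum_{j=1}^{K} p(z_j|\theta_d)\,p(w_i|z_j,\beta) \right) d\theta_d \tag{2.7}$$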
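The generative process behind Equation 2.6 can be simulated directly: draw $\theta$ from a Dirichlet distribution with parameter $\alpha$, then for each of the $N$ words draw a topic from $\theta$ and a word from the row of $\beta$ for that topic. The sketch below is a minimal illustration under assumed values; the vocabulary, $\alpha$, $\beta$, and function names are hypothetical, and the Dirichlet draw is implemented by normalizing Gamma draws from the standard library.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw a topic mixture theta ~ Dirichlet(alpha) by normalizing Gamma(alpha_k, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [g / total for g in draws]


def generate_document(num_words, alpha, beta, vocabulary, rng):
    """Generate one document: theta once per document, then a topic and a word per position."""
    theta = sample_dirichlet(alpha, rng)
    topics = list(range(len(alpha)))
    words = []
    for _ in range(num_words):
        z = rng.choices(topics, weights=theta)[0]        # topic drawn with probability p(z | theta)
        w = rng.choices(vocabulary, weights=beta[z])[0]  # word drawn with probability p(w | z, beta)
        words.append(w)
    return theta, words


# Hypothetical parameters (assumptions for illustration only).
vocabulary = ["contract", "pipeline", "dinner", "weekend"]
alpha = [0.5, 0.5]             # Dirichlet parameter, one entry per topic
beta = [
    [0.5, 0.4, 0.05, 0.05],    # p(w | topic 0): mostly business words
    [0.05, 0.05, 0.4, 0.5],    # p(w | topic 1): mostly personal words
]

rng = random.Random(42)
theta, words = generate_document(8, alpha, beta, vocabulary, rng)
print(theta)
print(words)
```

Unlike the mixture of unigrams, a single generated document here can mix words from several topics, because a topic is redrawn for every word.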


As with PLSI, the goal is to compute $p(z, \theta|d, \alpha, \beta)$. By using Bayes’ Rule, the
resulting equation is:

$$p(\theta, z|d, \alpha, \beta) = \frac{p(\theta, z, d|\alpha, \beta)}{p(d|\alpha, \beta)} \tag{2.8}$$


However, this equation is intractable because $\theta$ and $\beta$ are linked through $z$. To
overcome this, Blei, et al. [?] use variational inference to approximate values of $\alpha$ and $\beta$.

Variational methods work by taking some function that is either computationally
expensive to calculate or, as in the case of LDA, impossible to compute and, by
introducing additional variables, arriving at a function that is computed easily [?].
For instance, the function $\ln x$ is expensive to calculate. However, by introducing
another parameter, $\lambda$, a linear function, $\lambda x - (\ln \lambda + 1)$, can now provide an upper
bound on $\ln x$ that is tight when $\lambda = 1/x$.
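As a small numerical check of this bound (a sketch only; the function name
`linear_upper_bound` and the sample values are illustrative and do not come from the
cited work), the following evaluates $\lambda x - (\ln \lambda + 1)$ for a few values of $\lambda$ and
compares it with $\ln x$. The bound should hold for every $\lambda > 0$ and touch $\ln x$ at $\lambda = 1/x$.

```python
import math


def linear_upper_bound(x: float, lam: float) -> float:
    """Variational upper bound on ln(x): lam * x - (ln(lam) + 1), for lam > 0."""
    return lam * x - (math.log(lam) + 1.0)


x = 2.0
for lam in (0.25, 0.5, 1.0):
    bound = linear_upper_bound(x, lam)
    # The gap is never negative and vanishes at lam = 1 / x.
    print(f"lambda={lam}: bound={bound:.4f}  ln(x)={math.log(x):.4f}")
```

The point of the trick is that, for a fixed $\lambda$, the bound is linear in $x$, so the
expensive function is replaced by a cheap one plus an optimization over $\lambda$.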

This broadens to inference by extending the concept from functions to probability
distributions: an approximating family of conditional probability distributions with
variational parameters is introduced. For more details, refer to [?], [?].

For LDA, Blei [?] introduces two variational parameters, $\gamma$ and $\phi$, for $\alpha$ and $\beta$.
The distribution family that results is:

$$q(\theta, z \mid \gamma, \phi) = q(\theta \mid \gamma) \prod_{k=1}^{K} q(z_k \mid \phi_k) \tag{2.9}$$


Figure 2.5: Latent Dirichlet Allocation Variational Inference Model

Then by using Expectation Maximization, LDA arrives at values for $\gamma$ and $\phi$ of:


$$\phi_{ni} \propto \beta_{i w_n} \exp\{E_q[\ln(\theta_i) \mid \gamma]\} \tag{2.10}$$

$$\gamma_i = \alpha_i + \sum_{n=1}^{N} \phi_{ni} \tag{2.11}$$

After calculating values for $\alpha$ and $\beta$, there is still the need to consider smoothing
the probability distribution. It is still likely that new documents will contain words
that are not represented in $\beta$. Therefore, variational inference is performed again,
adding a prior approximating variable, $\eta$, for $\beta$. Performing Expectation
Maximization again yields the same equations for $\gamma$ and $\phi$ as well as an additional
equation, $\lambda_{ij} = \eta + \sum_{d=1}^{M} \sum_{n=1}^{N_d} \phi_{dni} w_{dn}^{j}$.

There has been recent work to improve upon LDA. One proposed enhancement
is to use Gibbs Sampling instead of Variational Inference in order to arrive at the best
values of $\alpha$ and $\beta$ [?]. A second proposal is, instead of considering an infinite number
of probability distributions for the topics, to consider a discrete number of them. Keller
and Bengio [?] have proposed a Theme Topic Mixture Model (TTMM) that considers
a theme to consist of a finite number of topics. Therefore, given a particular theme,
there is a specific probability of a topic occurring. In this case, given a finite number of
themes, LDA transforms into a model that is tractable by Expectation Maximization.
What makes this model more difficult is that in addition to deciding a priori how many
topics there are, to make the model work one must also decide a priori how many
themes there are. Finally, Minka and Lafferty have argued that the variational
methods used by Blei, et al. [?] lead to “inaccurate inferences and biased learning”.
Instead they propose using expectation-propagation to produce better estimates of
$\alpha$ and $\beta$ [?], which works by iteratively applying “deleting/inclusion” steps on the
integral.

2.3.3.4 Extending Mixture of Words and Topics to Users. While the
original objective for developing LDA and PLSI was to organize collections of text
(i.e. documents) into latent topics, this objective has recently expanded to include
attributing people to topics. These user models are required to determine user
interests.


2.3.3.5 Author Topic. The first such model is the Author-Topic
(AT) model described by Rosen-Zvi, et al. [?]. They introduce a new author
variable. There are P authors in the corpus. Each document contains a subset of people
who authored the document. AT [?] assumes that there are an infinite number of
multinomials that could be associated with picking a topic. The multinomial that
actually emerges from the corpus is just one possibility. AT explicitly “picks” one
collection of these topic probabilities by including a parameter, $\beta$, selected a priori for
a Dirichlet distribution. Then $\phi$ is drawn from this distribution, producing a
multinomial probability for topics conditioned on users. This mechanism prevents AT from
being constricted by the corpus. In addition to $\phi$, the probability distribution for
topics, a second parameter, $\alpha$, is also selected a priori for a second Dirichlet distribution.
In a similar manner to topics, $\theta$ is then drawn from this distribution to produce a
multinomial distribution for each word conditioned on the topic (Figure ??).


Figure  2.6:  Author  Topic  Model 


Based on this model, in order to create an email, N words are selected. To
pick a word, an author, $u$, is chosen uniformly from the population of P authors. Then
a topic, $z$, is selected conditioned on the author chosen. Since the probability of picking a
topic is also conditioned on $\phi$, the probability of picking a topic is $p(z \mid u, \phi)$. After
the topic is selected, the probability of picking a word is conditional on the topic,
$z$, and $\theta$. Therefore, the probability of selecting a word is $p(w \mid z, \theta)$. However, since
a word could be selected for any topic, the probability of each word is actually the
probability of that word for topic 1 plus the probability of that word for topic 2 and
so on. Similarly the probability of a topic must also be considered for each possible
$\phi$. Therefore, the generative equation for an email is:


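$$p(d \mid \alpha, \beta) = \int p(\phi \mid \beta) \left( \sum_{r=1}^{P} p(u_r) \int p(\theta \mid \alpha) \left( \prod_{i=1}^{N} \sum_{j=1}^{K} p(z_j \mid \phi)\, p(w_i \mid z_j, \theta) \right) d\theta \right) d\phi \tag{2.12}$$

and a generative model for the corpus of:

$$p(D \mid \alpha, \beta) = \prod_{d=1}^{M} \int p(\phi_d \mid \beta) \left( \sum_{r=1}^{P} p(u_r) \int p(\theta_d \mid \alpha) \left( \prod_{i=1}^{N_d} \sum_{j=1}^{K} p(z_j \mid \phi_d)\, p(w_i \mid z_j, \theta_d) \right) d\theta_d \right) d\phi_d \tag{2.13}$$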


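As with PLSI, the goal is to compute $p(z, \phi, \theta \mid d, \alpha, \beta)$. By using Bayes' Rule,
the resulting equation is:

$$p(\phi, \theta, z \mid d, \alpha, \beta) = \frac{p(\phi, \theta, z, d \mid \alpha, \beta)}{p(d \mid \alpha, \beta)} \tag{2.14}$$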


However, this model suffers from the same intractability problem that Blei, et
al. encountered in LDA. Rather than resolving this problem by using variational
inference, Rosen-Zvi, et al. solved it by using Gibbs sampling. Gibbs sampling
works by randomly assigning words to users and topics and then finding the resulting
conditional probabilities. This process is then repeated until the conditional
probabilities converge. Recall that based on the model, a user is selected uniformly. Then
a topic is selected conditioned on that user and then a word is selected conditioned
on that topic. Therefore, since with a corpus we start with words and users, it is
necessary to work backwards. By looking at the number of times a user has been
assigned to a particular topic, one infers the probability of that topic given that user,
i.e. $p(z \mid u) = n(u,z)/n(u)$, where $n(u,z)$ is the number of times topic $z$ was assigned
to user $u$ and $n(u)$ is the number of times user $u$ occurs in the corpus (i.e. the number
of times user $u$ is assigned to any word). Similarly, one infers the probability of
a word given a topic, i.e. $p(w \mid z) = n(w,z)/n(z)$, where $n(w,z)$ is the
number of times word $w$ was assigned to topic $z$ and $n(z)$ is the number of times topic
$z$ occurs in the corpus (i.e. the number of times topic $z$ is assigned to any word).
Mathematically, the conditional probabilities are:

$$p(z \mid u) = \frac{n(u, z) + \alpha}{n(u) + K\alpha} \tag{2.15}$$


(2.16) 

(2.17) 


Algorithmically:

1. Assign random probabilities to all conditional probabilities, i.e.
$p(z \mid w)$ and $p(u \mid w)$, such that they produce a probability
distribution (i.e. the probabilities are all non-negative and sum
to one).

2. For every word in every document, “determine” what topic and
user produced it. To do this, pick a random number between
0 and 1 and see which conditional probability it falls into.

3. Based on the number of times each user and topic was assigned
to a word, re-calculate the conditional probabilities.

4. Repeat steps 2 and 3 until convergence.

While it is possible to estimate $\alpha$ and $\beta$, following Rosen-Zvi, et al. [?], $\alpha$ is set
to 50/K and $\beta$ is set to 0.01.
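The four steps above can be sketched in a few lines of Python. This is an illustrative
sketch rather than the implementation used in this research: the function name
`author_topic_sampler`, the toy corpus, and the fixed iteration count are assumptions,
step 1 is replaced by a random initial assignment of a (user, topic) pair to every word,
and the re-estimation step assumes the usual smoothed count ratios with the values of
$\alpha$ and $\beta$ quoted above.

```python
import random
from collections import defaultdict


def author_topic_sampler(docs, num_topics, alpha=None, beta=0.01, iters=50, seed=0):
    """docs: list of (authors, words) pairs, each a list of strings.

    Returns (p_z_given_u, p_w_given_z) as dicts mapping a user or a word
    to a list with one probability per topic.
    """
    rng = random.Random(seed)
    K = num_topics
    alpha = 50.0 / K if alpha is None else alpha
    users = sorted({u for authors, _ in docs for u in authors})
    vocab = sorted({w for _, words in docs for w in words})
    V = len(vocab)

    # Step 1 (variant): random (user, topic) pair for every word token.
    assign = [[(rng.choice(authors), rng.randrange(K)) for _ in words]
              for authors, words in docs]

    def estimate():
        # Step 3: smoothed count ratios for p(z|u) and p(w|z).
        n_uz, n_u = defaultdict(int), defaultdict(int)
        n_wz, n_z = defaultdict(int), defaultdict(int)
        for (_, words), pairs in zip(docs, assign):
            for w, (u, z) in zip(words, pairs):
                n_uz[u, z] += 1
                n_u[u] += 1
                n_wz[w, z] += 1
                n_z[z] += 1
        p_zu = {u: [(n_uz[u, z] + alpha) / (n_u[u] + K * alpha) for z in range(K)]
                for u in users}
        p_wz = {w: [(n_wz[w, z] + beta) / (n_z[z] + V * beta) for z in range(K)]
                for w in vocab}
        return p_zu, p_wz

    p_zu, p_wz = estimate()
    for _ in range(iters):  # Step 4: fixed budget instead of a convergence test
        # Step 2: resample which user and topic produced each word token.
        for d, (authors, words) in enumerate(docs):
            pairs = [(u, z) for u in authors for z in range(K)]
            for i, w in enumerate(words):
                weights = [p_zu[u][z] * p_wz[w][z] for u, z in pairs]
                assign[d][i] = rng.choices(pairs, weights=weights, k=1)[0]
        p_zu, p_wz = estimate()
    return p_zu, p_wz


# Hypothetical three-email corpus; the last email has two "authors".
toy = [(["ann"], ["game", "score", "team"]),
       (["bob"], ["stock", "price", "trade"]),
       (["ann", "bob"], ["game", "price"])]
p_zu, p_wz = author_topic_sampler(toy, num_topics=2)
print({u: [round(p, 2) for p in dist] for u, dist in p_zu.items()})
```

With $\alpha = 50/K$ the smoothing dominates on a corpus this small, so the toy output is
close to uniform; the value only becomes reasonable at realistic corpus sizes.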

While Rosen-Zvi, et al. created their generative model to describe the creation
of papers for scientific journals, it also applies without adjustment to the creation
of emails. If we are only interested in the people who send emails, then the model
considers each document as having only a single author. If instead recipients of
emails are also included in the model, then “authors” expands to mean all the people
associated with the email, both senders and recipients. Very recently, McCallum, et
al. [?] observed that this model does not differentiate between senders and
recipients. For instance, senior level personnel often have email discussions amongst
themselves but blind carbon copy their assistants to perform some related task. In this
case, it is desirable to segregate the different recipients by their roles. McCallum, et
al. accomplished this by taking the Author Topic model and separating out recipients
from authors (Figure ??). Now each message is composed of only one author but
potentially multiple recipients. In addition, the selection of a topic is now based on
both the author and the recipient chosen for the given word. A final model, also
described in [?], adds variables for roles (Figure ??). In this
case each author has one or more roles associated with him. Once a recipient is
selected, roles for the author and the recipient are selected conditioned on the author
and recipient chosen. These roles are then used to select topics which are then used
to select words.


Figure  2.7:  Author  Recipient  Topic  Model 


Figure  2.8:  Role  Author  Recipient  Topic  Model 

When considering recipients, depending on the application, one may wish to
include or exclude those people carbon copied and/or blind carbon copied. Both the
Author Topic (AT) model and the Author Recipient Topic (ART) model do include
recipients. In order to select between them, it is important to consider whether the
person who sends the message is qualitatively different from the one who receives
it. McCallum, et al. argue that by differentiating between senders and receivers, it
is possible to extract the different roles that these people play. While this certainly
makes sense, it does not apply to the particular application of only extracting people's
interests. Whether a person sends an email about going to a basketball game or
receives an email about it, in both cases it is reasonable to assume that the person is
interested in basketball. Therefore, for clustering people's interests via email the AT
model is more appropriate.

2.3.3.6 PLSI-U. While LDA has been expanded to address users
through the AT model, PLSI has not. However, the expansion of PLSI to include
users is fairly straightforward. To begin, recall that Bayes' rule states that $p(a, b) =
p(a \mid b) \cdot p(b) = p(b \mid a) \cdot p(a)$. Therefore when considering $p(z \mid u, d, w)$ where $u$ represents
a person:

$$p(z \mid u, d, w)\, p(u, d, w) = p(u, d, w \mid z)\, p(z) \tag{2.18}$$


Now, consider the model in Figure ?? and observe that $u$, $d$, and $w$ are all conditionally
independent given $z$ (this is a subtle difference from the Author Topic model where
$z$ is dependent on $u$ and $w$ is dependent on $z$). It now follows that:

$$p(z \mid u, d, w) = \frac{p(u \mid z)\, p(d \mid z)\, p(w \mid z)\, p(z)}{p(u, d, w)} \tag{2.19}$$

But $p(u, d, w)$ is simply $p(u, d, w \mid z)\, p(z)$ marginalized across all possible $z$'s. So finally,

$$p(z \mid u, d, w) = \frac{p(u \mid z)\, p(d \mid z)\, p(w \mid z)\, p(z)}{\sum_{z'} p(u \mid z')\, p(d \mid z')\, p(w \mid z')\, p(z')} \tag{2.20}$$
In order to evaluate the conditional probabilities in the above equations, consider:

$$p(w \mid z) = \frac{p(z \mid w)\, p(w)}{p(z)} \tag{2.21}$$

By marginalizing across $u$ and $d$, we get:

$$p(w \mid z) = \frac{\sum_{u \in U} \sum_{d \in D} p(z \mid u, d, w)\, p(w \mid u, d)}{\sum_{u \in U} \sum_{d \in D} \sum_{w' \in W} p(z \mid u, d, w')\, p(w' \mid u, d)} \tag{2.22}$$


Finally, consider what $p(w \mid u, d)$ means. This is the probability of a given word
occurring for a given document and person. Since the document and person are already
specified, the probability space is the one document. Therefore the probability is the
number of times the word appears in the document divided by the number of words
in the document. Therefore:

$$p(w \mid z) = \frac{\sum_{u \in U} \sum_{d \in D} p(z \mid u, d, w)\, n(d, w)}{\sum_{u \in U} \sum_{d \in D} \sum_{w' \in W} p(z \mid u, d, w')\, n(d, w')} \tag{2.23}$$

where $n(d, w)$ is the number of times a word occurs in a document. Observe that
since a document is the same regardless of which “author” is considered, it is sufficient
to specify $n(d, w)$ so long as it is summed across all people. Furthermore, since the
denominator sums across all words, the net effect is the quotient described previously.
This equation extends naturally to documents and users:

$$p(d \mid z) = \frac{\sum_{u \in U} \sum_{w \in W} p(z \mid u, d, w)\, n(d, w)}{\sum_{u \in U} \sum_{d' \in D} \sum_{w \in W} p(z \mid u, d', w)\, n(d', w)} \tag{2.24}$$

$$p(u \mid z) = \frac{\sum_{d \in D} \sum_{w \in W} p(z \mid u, d, w)\, n(d, w)}{\sum_{u' \in U} \sum_{d \in D} \sum_{w \in W} p(z \mid u', d, w)\, n(d, w)} \tag{2.25}$$

$$p(z) \propto \sum_{u \in U} \sum_{d \in D} \sum_{w \in W} p(z \mid u, d, w)\, n(d, w) \tag{2.26}$$


These equations now form the expectation (eq. ??) and maximization (eq. ??, eq.
??, eq. ??, eq. ??) equations for Expectation-Maximization (EM). After a random
initialization, EM alternates the two steps until convergence:

1. Assign random probabilities to $p(d \mid z)$, $p(w \mid z)$, $p(u \mid z)$, and $p(z)$
such that they produce probability distributions (the probabilities
are all non-negative and sum to one).

2. Calculate all of the values for $p(z \mid u, d, w)$.

3. Using the values from step 2, calculate the new values of $p(d \mid z)$,
$p(w \mid z)$, $p(u \mid z)$, and $p(z)$.

4. Repeat steps 2 and 3 until convergence.
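The EM loop above can be sketched as follows. This is a minimal illustration under
stated assumptions, not the implementation used in this research: the function name
`plsi_u`, the input layout (a list of hypothetical `(user, document, word, count)`
records, with the count $n(d, w)$ repeated for each person associated with the
document), and the fixed number of iterations are all choices made for the example.

```python
import random


def plsi_u(triples, num_topics, iters=100, seed=0):
    """Returns (p_u, p_d, p_w, p_z); p_u[z][u] is p(u|z), p_z[z] is p(z)."""
    rng = random.Random(seed)
    K = num_topics
    users = sorted({t[0] for t in triples})
    docs = sorted({t[1] for t in triples})
    words = sorted({t[2] for t in triples})

    def normalize(table):
        total = sum(table.values()) or 1.0  # guard against an empty topic
        return {key: value / total for key, value in table.items()}

    def random_dist(keys):
        return normalize({key: rng.random() + 1e-3 for key in keys})

    # Step 1: random starting values that are valid probability distributions.
    p_u = [random_dist(users) for _ in range(K)]
    p_d = [random_dist(docs) for _ in range(K)]
    p_w = [random_dist(words) for _ in range(K)]
    start = random_dist(range(K))
    p_z = [start[z] for z in range(K)]

    for _ in range(iters):  # Step 4: fixed budget instead of a convergence test
        acc_u = [dict.fromkeys(users, 0.0) for _ in range(K)]
        acc_d = [dict.fromkeys(docs, 0.0) for _ in range(K)]
        acc_w = [dict.fromkeys(words, 0.0) for _ in range(K)]
        acc_z = [0.0] * K
        for u, d, w, n in triples:
            # Step 2 (expectation): posterior p(z | u, d, w) for this record.
            joint = [p_u[z][u] * p_d[z][d] * p_w[z][w] * p_z[z] for z in range(K)]
            total = sum(joint)
            if total == 0.0:
                continue  # numerically dead record; skip it
            for z in range(K):
                resp = n * joint[z] / total
                acc_u[z][u] += resp
                acc_d[z][d] += resp
                acc_w[z][w] += resp
                acc_z[z] += resp
        # Step 3 (maximization): renormalize the expected counts.
        p_u = [normalize(a) for a in acc_u]
        p_d = [normalize(a) for a in acc_d]
        p_w = [normalize(a) for a in acc_w]
        total_z = sum(acc_z) or 1.0
        p_z = [a / total_z for a in acc_z]
    return p_u, p_d, p_w, p_z


# Hypothetical records: two emails about unrelated subjects.
records = [("ann", "d1", "game", 2), ("ann", "d1", "score", 1),
           ("bob", "d1", "game", 2), ("bob", "d1", "score", 1),
           ("cat", "d2", "stock", 2), ("cat", "d2", "price", 1)]
p_u, p_d, p_w, p_z = plsi_u(records, num_topics=2, iters=50)
print([round(p, 3) for p in p_z])
```

Only observed records are visited, so unobserved (user, document, word) combinations
contribute nothing to the sums, which keeps each pass linear in the size of the data.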


Figure  2.9:  PLSI-User  Mixture  Model 

Only one more point needs discussing before moving to the implementation of
these models. The Author Topic model focuses on the relationship between authors
and categories and between categories and words. Other than acknowledging that these
words and authors are combined into documents, documents are not explicitly
included in the development of the model. However, since it is necessary to classify the
emails in order to create the explicit network, the probability of a category given a
document needs development. Observing that for a document to exist, each word in
the document must exist, it follows:

$$p(d \mid z) = \prod_{w \in d} p(w \mid z) \tag{2.27}$$

The one problem with this is that for documents containing many words, the
conditional probability of any document will quickly go to zero. Even if the conditional
probabilities are later normalized to construct a probability distribution, if too much
precision is lost first, such a calculation won't work. To avoid this problem, and
noticing that the probabilities will be normalized anyway, the logarithm of the function is
taken. Furthermore, to prevent the problem of $\log 0$, the function is multiplied by $e^N$.
The result is:

$$p(d \mid z) \propto \sum_{w \in d} \log(1 + p(w \mid z)) \tag{2.28}$$
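The precision problem and the logarithmic score can be illustrated with a short
sketch. The function names `naive_product` and `topic_scores` and the word
probabilities are hypothetical; the scoring follows equation 2.28 and then normalizes
across topics, which is an assumption about how the scores are turned into a
distribution.

```python
import math


def naive_product(doc_words, p_w):
    """Direct product of p(w|z) over the words of a document (equation 2.27)."""
    result = 1.0
    for w in doc_words:
        result *= p_w.get(w, 0.0)
    return result


def topic_scores(doc_words, p_w_given_z):
    """Scores proportional to sum over w in d of log(1 + p(w|z)), normalized."""
    raw = [sum(math.log1p(p_w.get(w, 0.0)) for w in doc_words)
           for p_w in p_w_given_z]
    total = sum(raw)
    return [r / total for r in raw] if total > 0 else raw


p_w_given_z = [{"game": 0.001, "stock": 0.0001},   # hypothetical topic 0
               {"game": 0.0001, "stock": 0.001}]   # hypothetical topic 1
long_doc = ["game"] * 2000

print(naive_product(long_doc, p_w_given_z[0]))  # product form loses all precision
print(topic_scores(long_doc, p_w_given_z))      # log form still ranks topic 0 first
```

Because `log1p(0.0)` is zero, a word that a topic never produced simply adds nothing
to that topic's score instead of forcing the whole document score to zero.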

2.4 Examining Social Networks

When  the  clustering  algorithms  complete,  three  things  are  produced:  the  most 
probable  words  for  a  topic  (highest  value  of  p(w\z)),  the  most  probable  individuals 
for  a  topic  (highest  value  of  p{u\z)),  and  the  most  probable  documents  for  a  topic 
(p(d\z)).  Topics  are  then  considered  as  topics  of  interest  for  an  individual  if  p(z\u) 
exceeds  a  certain  threshold.  An  individual’s  profile  is  then  the  collection  of  all  of  his 
topics  of  interest. 
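As an illustration of how a profile could be derived, the sketch below obtains
$p(z \mid u)$ from $p(u \mid z)$ and $p(z)$ by Bayes' rule and keeps the topics above a
threshold. The function name `user_profiles`, the probabilities, and the threshold of
0.2 are hypothetical; the text does not state the threshold value used.

```python
def user_profiles(p_u_given_z, p_z, threshold=0.2):
    """Map each user to the set of topic indices z with p(z|u) > threshold."""
    users = sorted(set().union(*[table.keys() for table in p_u_given_z]))
    profiles = {}
    for u in users:
        # Bayes' rule: p(z|u) is proportional to p(u|z) * p(z).
        joint = [p_u_given_z[z].get(u, 0.0) * p_z[z] for z in range(len(p_z))]
        total = sum(joint)
        p_z_given_u = [j / total for j in joint] if total > 0 else joint
        profiles[u] = {z for z, p in enumerate(p_z_given_u) if p > threshold}
    return profiles


p_u_given_z = [{"ann": 0.8, "bob": 0.2},   # hypothetical p(u | topic 0)
               {"ann": 0.1, "bob": 0.9}]   # hypothetical p(u | topic 1)
profiles = user_profiles(p_u_given_z, p_z=[0.5, 0.5])
print(profiles)
```

Raising the threshold shrinks every profile, so the choice trades missed interests
against spurious ones.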

Once  profiles  have  been  generated,  the  final  step  is  to  use  these  profiles  to  find 
likely  insider  threats.  The  method  used  for  this  research  is  performed  in  three  steps. 
The  first  step  is  to  use  these  profiles  to  find  individuals  that  have  an  interest  in 
sensitive topics. The second and third steps depend on using these profiles to develop
social  networks.  The  second  step  uses  social  networks  to  find  individuals  that  feel 
alienated  from  the  organization.  Finally,  once  an  insider  is  known,  the  third  step  uses 
social  networks  to  find  other  individuals  with  common  interests. 

2.4.1 Social Network Attributes. A social network is “a finite set or sets of
actors and the relation or relations defined on them” [?]. Actors are either
individuals or groups of individuals and the relations between them are any form of social
relationship. For this paper, actors refer to individuals and two actors are considered
related if they have at least one interest in common. Consider Figure ??. There are
several ways to construct this graph. Obviously each vertex represents one actor or
person but it is not clear how to represent a shared interest. For example, if John
is very interested in sports and Mary has a small interest in it, do they share this
interest? On the other hand, if Susan and Mike share two interests like tennis and
skydiving, should they be connected by two edges or should they only be connected
by one edge with a much higher weight [?]?


Figure  2.10:  Example  of  a  Social  Network  Graph 


Table  2.1:  Example  of  an  Adjacency  Matrix 


John  Steve 

Mary 

Susan 

Dave 

Diane 

Mike  Ellen 

Jeff  Stacy 

Grace 

Chuck 

John 

1 

1 

Steve 

Mary 

1 

1 

Susan 

1 

1 

1 

2 

Dave 

1 

1 

Diane 

1 

2 

Mike 

2 

2 

1 

Ellen 

1 

Jeff 

5 

1 

Stacy 

5 

1 

Grace 

1 

Chuck 

1 

Once the graph has been constructed, the associated adjacency matrix
(Table ??) is used to determine the number of paths between people. The adjacency
matrix itself shows the number of paths of length 1 between people, i.e. the number
of relationships that people have with each other. This is also the number of edges
between vertices. The square of the adjacency matrix (i.e. the adjacency matrix
multiplied by itself) shows the number of paths of length 2 between people, i.e. how many
“friends” people have in common. This extends to paths of any length by continuing
to multiply the adjacency matrix by itself [?]. The number of paths between people
is used for many different purposes. In many applications, this knowledge identifies
the “power brokers”, the people on the most paths between people. There are several
methods used to measure an actor's importance.
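The path-counting property can be checked on a small example. The four-person
matrix below is hypothetical (it is not Table 2.1): an entry counts the edges between
two people, with Susan and Mike joined by two edges. Note that powers of the matrix
count paths in which a vertex may be revisited, so the diagonal of the square is
generally non-zero.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


names = ["John", "Mary", "Susan", "Mike"]
adjacency = [
    [0, 1, 1, 0],  # John knows Mary and Susan
    [1, 0, 1, 0],  # Mary knows John and Susan
    [1, 1, 0, 2],  # Susan shares two interests with Mike
    [0, 0, 2, 0],  # Mike
]

squared = mat_mul(adjacency, adjacency)
for name, row in zip(names, squared):
    print(f"{name:>5}: {row}")
# squared[0][1] is the number of length-2 paths from John to Mary (via Susan).
```

Multiplying `squared` by `adjacency` once more gives the counts for length 3, and so on.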

These individual measurements can then be aggregated to arrive at a measure
of the network's cohesiveness. Consider the three graph types shown in Figure ??.
In the Star graph, everyone is clustered around one actor. In addition to this one
actor being very important, the overall network is also very effective at disseminating
information quickly since everyone is within a distance of at most two of everyone
else. Unlike the Star graph, disseminating information in the Circle and Line graphs
may take significantly longer since the distance between two arbitrary people may
be quite large. However, while information may take longer to spread in the Line
and Circle graphs, they do appear more “fair” since roughly everyone is equally
well-connected. Also, removing one actor from the Line graph does halt information flow,
while removing the wrong actor from the Star graph disconnects everyone.


Figure 2.11: Some representative Social Network Graphs (Star graph, Circle graph, Line graph)

2.4.2 Centrality Measurements. Several measurements that are used in
social network analysis to describe an individual's importance or centrality are: degree,
closeness, and betweenness [?]. Degree is the simplest measurement of centrality and
assumes that the most central actors must be connected to the most other actors.
The normalized form of this measurement is:

$$C'_D(v_i) = \frac{d(v_i)}{N - 1} \tag{2.29}$$


where $C'_D$ is the normalized measurement of degree centralization, $v_i$ is a vertex (actor)
in the network, $d(v_i)$ is the degree of the vertex and $N$ is the number of actors in the
network. For the Star graph, the resulting values are 1.0 for the central actor and 0.167
for each of the other actors. The values for all of the actors in the Circle graph are 0.333
while the two on the end for the Line graph are 0.167. Actor degree centralization
can be extended to the overall network several ways, all of which measure dispersion
in some way. One common formula [?] is:

$$C_D = \frac{\sum_{i=1}^{N} \left[ C_D(v^*) - C_D(v_i) \right]}{(N - 1)(N - 2)} \tag{2.30}$$


where $C_D(v^*)$ is the maximum degree centralization for the network. Another
measurement using degree is the density of a network. Density measures how many
connections exist between individuals. The more individuals who communicate, the
denser the graph is. Density is measured by:

$$\text{density} = \frac{2L}{N(N - 1)} \tag{2.31}$$

where $L$ is the number of edges in the network. The densities for the sample graphs
are: Star graph: 0.286, Circle graph: 0.333, Line graph:
0.286. Observe that the density of the complete graph is 1 and the empty graph is 0.
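The degree and density figures quoted for the three sample graphs can be reproduced
with a short sketch. It assumes each sample graph has seven actors, which is inferred
from the quoted values rather than stated in the text, and it takes density to be the
number of edges divided by the number of possible edges. The helper names are
illustrative.

```python
def make_graph(n, edges):
    """Undirected graph on vertices 0..n-1 as a dict of neighbor sets."""
    adj = {v: set() for v in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def degree_centrality(adj):
    """Normalized degree d(v) / (N - 1) for every vertex."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def degree_centralization(adj):
    """Sum of (max degree - degree) divided by (N - 1)(N - 2)."""
    n = len(adj)
    degrees = [len(nbrs) for nbrs in adj.values()]
    return sum(max(degrees) - d for d in degrees) / ((n - 1) * (n - 2))


def density(adj):
    """Edges present divided by the N(N - 1)/2 edges possible."""
    n = len(adj)
    num_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    return num_edges / (n * (n - 1) / 2)


N = 7  # assumed size of each sample graph
star = make_graph(N, [(0, i) for i in range(1, N)])
circle = make_graph(N, [(i, (i + 1) % N) for i in range(N)])
line = make_graph(N, [(i, i + 1) for i in range(N - 1)])

for name, graph in (("Star", star), ("Circle", circle), ("Line", line)):
    centrality = degree_centrality(graph)
    print(name, round(max(centrality.values()), 3), round(min(centrality.values()), 3),
          round(degree_centralization(graph), 3), round(density(graph), 3))
```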

A second measurement of centrality is closeness, which measures the distance of an
actor from the other actors in the network. It does this by calculating the distance of
the shortest path between a vertex (actor) and all of the other vertices in the network.
The equation for normalized closeness centralization is:

$$C'_C(v_i) = \frac{N - 1}{\sum_{j=1}^{N} d(v_i, v_j)} \tag{2.32}$$
where $d(v_i, v_j)$ is the distance of the shortest path from $v_i$ to $v_j$. Observe that this
measurement assumes a connected network since if the network is disconnected then
at least one distance is infinite. There are a couple of ways of overcoming this problem.
One is to set the distance between two vertices without a path between them to one
more than the longest possible shortest path. A second is to perform this measurement
on the largest component of the network. While in general this would not make sense,
if the network has only one large cluster, e.g. containing 98% of the vertices, then
such a measurement is a good heuristic for the closeness measurement. For the Star
graph, the closeness centralization of the central actor is 1.0 while it is 0.545 for all
of the other actors. For the Circle graph, the measurement of all of the actors is 0.5.
Finally, for the Line graph, the measurement is 0.50 for the middle actor and decreases
toward the two actors on the end, for which it is 0.286. In a similar manner to degree,
closeness can be extended to the graph as follows:

$$C_C = \frac{\sum_{i=1}^{N} \left[ C'_C(v^*) - C'_C(v_i) \right]}{(N - 2)(N - 1)/(2N - 3)} \tag{2.33}$$

where $C'_C(v^*)$ is the largest closeness centralization measurement. For the Star graph,
this measurement is 1.0 signifying that there is one actor connected to everyone else.
Conversely, the Circle graph achieves the measurement of 0 signifying that all shortest
path distances are equal. Finally, the Line graph has a relatively small value of 0.277.
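The closeness figures can be reproduced with a breadth-first search over each sample
graph. As before, this is a sketch that assumes seven actors per graph (inferred from
the quoted values); the helper names are illustrative, and the code simply rejects
disconnected graphs rather than applying either workaround described above.

```python
from collections import deque


def make_graph(n, edges):
    """Undirected graph on vertices 0..n-1 as a dict of neighbor sets."""
    adj = {v: set() for v in range(n)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def bfs_distances(adj, source):
    """Shortest-path length from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for nbr in adj[v]:
            if nbr not in dist:
                dist[nbr] = dist[v] + 1
                queue.append(nbr)
    return dist


def closeness(adj):
    """Normalized closeness (N - 1) / sum of distances, per vertex."""
    n = len(adj)
    result = {}
    for v in adj:
        dist = bfs_distances(adj, v)
        if len(dist) < n:
            raise ValueError("closeness requires a connected graph")
        result[v] = (n - 1) / sum(dist.values())
    return result


def closeness_centralization(adj):
    """Group-level closeness, normalized by (N - 2)(N - 1) / (2N - 3)."""
    n = len(adj)
    values = closeness(adj).values()
    top = max(values)
    return sum(top - c for c in values) / ((n - 2) * (n - 1) / (2 * n - 3))


N = 7  # assumed size of each sample graph
star = make_graph(N, [(0, i) for i in range(1, N)])
circle = make_graph(N, [(i, (i + 1) % N) for i in range(N)])
line = make_graph(N, [(i, i + 1) for i in range(N - 1)])

for name, graph in (("Star", star), ("Circle", circle), ("Line", line)):
    per_actor = [round(c, 3) for c in closeness(graph).values()]
    print(name, per_actor, round(closeness_centralization(graph), 3))
```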

The last centralization measurement to discuss is betweenness. This is a very
expensive measurement to calculate as it measures the number of shortest paths
between vertices that pass through a third vertex compared to the total number of
shortest paths between vertices. Rather than use the Star, Circle, or Line graphs,
consider the graph in Figure ??. A is on shortest paths from G to F, G to C, G to
D, and G to E. However, in each case, there is another shortest path that does not
involve A (namely the one through B). Therefore, $C_B(A) = 0.5 + 0.5 + 0.5 + 0.5 = 2$.
Finally, this result can be normalized by dividing by the maximum number of shortest
paths through a vertex, $(N - 1)(N - 2)/2$. The equation, then, is:

$$C_B(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}} \tag{2.34}$$

where $\sigma_{st}$ is the number of shortest paths from $s$ to $t$ and $\sigma_{st}(v)$ is the number
of shortest paths from $s$ to $t$ through $v$. For the sample graph, the results are:

The normalized value is:

$$C'_B(v_i) = \frac{C_B(v_i)}{(N - 1)(N - 2)/2} \tag{2.35}$$

One advantage of this measurement over closeness is that it can be computed exactly
even for disconnected graphs. Unfortunately, one disadvantage has been a
computation cost of $\Theta(N^3)$. However, Brandes [?] has recently developed an algorithm that
can compute betweenness for unweighted graphs in $O(NM)$ where $M$ is the number
of edges. For dense graphs, where $M$ can be as large as $N(N - 1)/2$, this isn't
helpful. However, in sparse graphs, this is a huge time savings. Finally, centralization
can also be extended to networks similarly to degree and closeness as:


$$C_B = \frac{\sum_{i=1}^{N} \left[ C'_B(v^*) - C'_B(v_i) \right]}{N - 1} \tag{2.36}$$

Figure 2.12: An example Social Network Graph to measure Betweenness
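A compact sketch of betweenness for unweighted graphs, in the style of Brandes'
algorithm, is given below. The edge list is hypothetical: it is one seven-vertex graph
that is consistent with the description of A, B, and G above (every shortest path from
G to C, D, E, or F passes through either A or B), not a reproduction of the figure.
The function names are illustrative.

```python
from collections import deque


def make_graph(edges):
    """Undirected graph as a dict mapping each vertex to its neighbor set."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def betweenness(adj):
    """Unnormalized betweenness of every vertex in an unweighted graph."""
    cb = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest vertices back toward s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                cb[w] += delta[w]
    # Each unordered pair (s, t) was visited from both ends.
    return {v: c / 2.0 for v, c in cb.items()}


def normalized_betweenness(adj):
    """Betweenness divided by (N - 1)(N - 2) / 2."""
    n = len(adj)
    return {v: c / ((n - 1) * (n - 2) / 2) for v, c in betweenness(adj).items()}


inner = ["C", "D", "E", "F"]
edges = [("G", "A"), ("G", "B")]
edges += [(hub, x) for hub in ("A", "B") for x in inner]
edges += [(x, y) for i, x in enumerate(inner) for y in inner[i + 1:]]
sample = make_graph(edges)

print(betweenness(sample)["A"])
print(round(normalized_betweenness(sample)["A"], 3))
```

Each source vertex costs one breadth-first search plus one backward pass, which is
where the $O(NM)$ total comes from.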
2.4.3 Current Social Network Research That Uses Email. It is important to
remember that the entire social network is constructed based on perceived common
interests. This could be further enhanced by adding edges between people who already
know each other. This can be done a priori by an expert and/or automatically by
examining the email headers of each person. While using an expert to do it requires
more effort, it does have the added benefit of allowing the shared interest (i.e. why
the two people know each other) to be encoded in the link. Once this augmented graph
has been constructed, it is used to infer other clandestine networks. Liben-Nowell
and Kleinberg [?] show that several proximity measurements predict additional links
about 40 to 50 times better than pure chance. These measurements include the
shortest path between two vertices, the total number of distinct paths between two
vertices, the length of a random walk between two vertices, and the degrees of two
vertices.

There  already  exist  tools  that  develop  social  networks  from  email  information. 
ReferralWeb  [?]  uses  the  co-occurrence  of  names  in  close  proximity  in  any  document 
publicly  available  on  the  WWW  (e.g.  journal  articles,  newsgroups,  chatrooms  etc.) 
to  denote  a  close  relationship.  Lada  Adamic  has  done  similar  research  [?]  by  using 
mailing  lists  and  the  homepages  of  students  at  Stanford  and  MIT.  Since,  when  people 
create  homepages,  they  link  to  their  friends’  homepages  (and  ask  their  friends  to 
link  to  theirs),  she  postulated  that  using  homepages  would  result  in  an  appropriate 
social  network.  She  also  used  the  text  present  on  the  web  pages  to  further  predict 
relationships  (i.e.  common  interests)  between  people.  While  she  was  able  to  show 
that  the  text  provided  strong  indications  of  friendships  between  people,  it  is  unclear  if 
this would generalize beyond the rather closed community of a university. Culotta, et
al. [?] approached this problem differently and drew on a more general population. They
began by extracting names from email messages. Then they used the WWW to find
the person's “web presence” (generally his or her homepage) and used that to describe
the person and to find friends of that person. After the network was created, they
used graph partitioning algorithms to find highly connected components. While their
dataset was small (53 email correspondents), their results were promising. However,
one of the biggest drawbacks was a lack of web presence for many of the correspondents
(31 of 53). Although it is likely that over time more and more people will have a web
presence (e.g. young adults are much more likely to create home pages than people
in their forties and fifties), it will probably still be a while before web presence is a
reliable means of predicting relationships.

Since September 11, 2001 there has been increased research in uncovering potential
threats through the use of social networks. However, despite several organizations,
such as Rand Corporation and Mitre, making proposals for using social networks to
detect insider threats [?,?], little public research has been done. Symonenko [?] has
generated  social  networks  of  intelligence  analysts  and  then  used  semantic  analysis  to 
detect  when  individuals  are  showing  interests  in  areas  outside  of  their  group.  While 
the  results  have  been  promising,  the  technique  requires  a  large  amount  of  interviews 
with  experts  to  provide  the  semantic  analysis.  In  addition,  this  expert  knowledge  is 
then  only  applicable  to  the  specific  group  and  needs  to  be  repeated  each  time  the 
application  is  moved  to  another  organization.  Yee  [?]  has  also  performed  some  initial 
research  into  generating  social  networks  from  email  headers  for  later  analysis  by  social 
network  analysts. 

The  actual  methods  used  to  develop  social  networks  from  email  activity  are 
discussed  in  more  detail  in  Chapter  ??  and  the  actual  email  corpus  used  in  testing 
is  discussed  in  the  following  section. 

2.5  Using  Enron’s  Email  as  Source  Data 

Electronic mail is fast becoming the most common form of communication. By
2006, email traffic is expected to exceed 60 billion messages daily [?]. While
email was always expected to increase communication between people in different
cities and countries, the extent to which people in the same building, and even
the same room, communicate via email was not anticipated. Not only is email used to
inform others, it is also used to inform oneself: it is becoming more and more common
for people to send email to themselves as reminders. In addition to these genuinely
informational emails, unwanted email, also known as spam, has evolved from an academic
curiosity, to a minor nuisance, to a significant business problem. Spam accounts for
over half of all messages sent [?]. In addition to relatively innocuous spam, in 2004
at least eight of the ten most frequently reported computer worms were delivered via
email.

Using email as data for research is not new. However, the primary objectives of
this kind of research have changed. While earlier topics like email categorization
[?] and spam filtering [?,?] continue to be actively researched, new topics have
emerged as well. Two years ago, the Conference on Email and Anti-Spam (CEAS) was
created at Stanford with a focus solely on email and spam (www.ceas.cc).
In addition to spam filtering and email categorization, its papers have focused on
topics like extracting social networks from email [?,?], inferring user activity from
email [?], and extracting text features [?].

In addition, email is beginning to emerge as a tool for detecting deceptive
communications [?]. While there has been a large amount of research into blocking
incoming mail that is deemed suspicious [?], the idea of reading outgoing mail
has received little attention. This is due in large part to privacy concerns and
the lack of large-scale email datasets. While Keila and Skillicorn focus on the use of
email features to detect deceptive emails, it is equally reasonable to use the semantic
content [?,?]. Semantic analysis, i.e., extracting meaning from text, has been directly
applied to countering insider threats by Symonenko et al. [?]. They investigated
the effectiveness of using natural language processing (NLP) to discover intelligence
analysts who were accessing information outside of their community of interest.

2.5.1  Privacy  Concerns.  The  benefits  of  finding  better  ways  to  protect 
against  threats  must  be  balanced  with  respecting  people’s  privacy.  Despite  signing 
consents  to  monitoring,  people  still  have  the  right  to  expect  to  come  to  work  without 
having  someone  rifle  through  their  desk  [?].  In  this  new  electronic  age,  the  same 
consideration  is  given  to  people’s  electronic  desks  (i.e.  their  computers,  personal  data 
assistants,  et  cetera)  as  to  their  physical  ones.  This  is  even  more  true  when  the 
purpose  of  monitoring  is  not  to  detect  a  real  threat  but  only  to  perform  research. 
Therefore,  before  gathering  electronic  data,  investigators  must  first  consider  ways  to 
protect  people’s  privacy.  One  possibility  is  to  have  everyone  sign  a  consent  form. 
While  this  has  been  done  for  small  scale  studies  (e.g.  1-20  people),  it  is  unrealistic 
when  the  goal  is  to  study  an  organization  with  personnel  in  the  thousands.  The 
researcher  must  consider  other  options. 

One  method  often  used  to  address  privacy  concerns  is  stripping  out  people’s 
names  from  email  headers  and  replacing  them  with  unique  identifiers  [?].  In  this 
manner,  it  is  possible  to  develop  social  networks  without  the  chance  of  someone’s 
privacy being violated. Unfortunately, since the research is often based only on the
email headers, the same cleanup is not done for the email text itself. In many ways
this problem is harder, since it is possible to refer to someone by many different names
(e.g. Robert Smith, Rob, Bob, the boss, the President). One way to address this is
to simply remove all proper names from email as well as some titles (e.g. boss, dean,
etc.). Unfortunately, this may also strip out words that are helpful for categorizing
the email. For instance, it may be desirable when looking through email to categorize
based on words like “George Bush”, “al Qaeda”, and “Osama bin Laden”.
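
The header-anonymization step described above can be sketched as follows. This is a
minimal illustration, not the procedure of any cited study; the function name
`pseudonymize`, the example addresses, and the `P0001`-style identifiers are all
assumptions made for the example.

```python
def pseudonymize(headers):
    """Replace each distinct address with a stable, opaque identifier."""
    ids = {}  # address -> identifier, filled in as addresses are first seen

    def ident(address):
        key = address.strip().lower()  # assumption: addresses are case-insensitive
        if key not in ids:
            ids[key] = "P%04d" % (len(ids) + 1)
        return ids[key]

    cleaned = []
    for sender, recipients in headers:
        cleaned.append((ident(sender), [ident(r) for r in recipients]))
    return cleaned, ids


# Hypothetical header data: (sender, list of recipients).
headers = [
    ("rob.smith@example.com", ["ann.lee@example.com"]),
    ("ann.lee@example.com", ["Rob.Smith@example.com", "joe.ray@example.com"]),
]
cleaned, ids = pseudonymize(headers)
```

Because the same address always maps to the same identifier, the link structure
needed for a social network survives. Names inside the message bodies are untouched,
which is exactly the limitation the paragraph above describes.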

There is an alternative to both artificially generated email and real-world email
with its associated privacy concerns. During the investigation into the Enron scandal,
the Federal Energy Regulatory Commission (FERC) made email from Enron
publicly available. As a part of this process, it made the email of 151, primarily
senior, Enron employees accessible electronically on the World Wide Web
(http://fercic.aspensys.com/members/manager.asp). This data was later taken by
researchers at several institutions, organized, and cleansed of some integrity problems
before being assembled into a useable format. The data is now available as
a series of text documents from Carnegie Mellon University [?] and as a MySQL
database from the University of California at Berkeley [?]. The data (after cleanup)
consists of email from 151 people’s folders (although it seems that two people appear
twice with different user names and one person only sent automated calendar reminders)
comprising 250,484 unique messages. While the data has not been available
for long, it is beginning to come to the attention of researchers as a unique source of
data. Some of the earliest work using the Enron data includes analysis of automatic
email classification into folders [?,?], extracting hidden (i.e. forwarded) emails from
emails [?], determining email response times [?], investigating the structure of
words in email [?], and developing social networks of Enron [?,?].

In  this  thesis,  the  email  bodies  from  the  Enron  corpus  are  used  to  develop  user 
interest  profiles.  These  user  profiles  are  then  used  to  develop  an  implicit  social  network 
based  on  interests.  In  addition,  a  second,  explicit,  social  network  is  constructed  from 
the  email  headers.  This  social  network  maps  the  explicit  links  between  people.  It 
is posited that while people may not communicate with all of the people who have
the same interests, they will talk only to people with whom they share at least one
interest. If people appear to have an interest but do not talk to anyone with a similar
interest, this may be suggestive of a hidden interest. While most hidden interests are
likely innocuous, their discovery may indicate the need for more specific attention by
people like their supervisors or security. The data sources and the techniques used to
extract the relevant information are discussed in more detail in Chapter ??.
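
The hypothesis above can be expressed as a simple set comparison. The sketch below is
illustrative only: the data structures (`interests`, mapping each person to a set of
topic labels, and `contacts`, mapping each person to the set of people they exchange
email with), the example names, and the function name `hidden_interests` are all
assumptions. The thesis derives interests from topic models rather than from
hand-assigned labels.

```python
def hidden_interests(interests, contacts):
    """Return {person: topics} for topics a person holds but shares with no contact."""
    flagged = {}
    for person, topics in interests.items():
        # Union of the interests of everyone this person corresponds with.
        shared_pool = set()
        for other in contacts.get(person, set()):
            shared_pool |= interests.get(other, set())
        hidden = topics - shared_pool
        if hidden:
            flagged[person] = hidden
    return flagged


# Hypothetical implicit network (interests) and explicit network (contacts).
interests = {
    "alice": {"trading", "partnerships"},
    "bob": {"trading"},
    "carol": {"partnerships", "golf"},
}
contacts = {
    "alice": {"bob"},
    "bob": {"alice"},
    "carol": set(),
}
result = hidden_interests(interests, contacts)
```

In this toy data, alice and carol share an interest in "partnerships" but never
correspond with anyone who holds it, so both are flagged for that topic; bob is not
flagged because his only interest is shared with a contact.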

2.5.2 A Brief Synopsis of Enron. Since this thesis is based on the Enron
data, a little background on the events contributing to the demise of Enron is helpful.
Ken Lay entered the natural gas industry in the 1980s because he believed
that the industry would soon become deregulated. It did, and he was able to create
a successful company by riding the wave of natural gas deregulation. In 1990 Lay
brought in Jeff Skilling to run a new business within Enron focused on selling
long-term fixed-price contracts for natural gas to regional distributors and purchasing
long-term fixed-price contracts from natural gas refineries. The profit came from the
difference between the two contracts, modelled in part on bank mortgages: banks loan
money to people (gas refineries) to buy a house and collect monthly payments from
borrowers and other customers (regional distributors). One of the first things
Skilling did was implement “mark-to-market” accounting within this new business.
This means that the total profit is recognized at the time the contract is purchased
from the refinery by estimating what it will sell for during the next twenty years.

Of  course,  if  the  actual  profits  are  different,  the  additional  profit  (or  loss)  must 
be  recognized  in  full  as  soon  as  the  company  becomes  aware  of  it.  Therefore,  to  avoid 
reporting  losses  it  is  important  to  lock  in  the  gains.  When  Skilling  first  asked  Andy 
Fastow,  the  Chief  Financial  Officer,  for  a  way  to  lock  in  these  gains,  Fastow  turned  to 
the  Research  Group  for  answers.  He  was  told  by  its  head,  Vince  Kaminski,  that  there 
was  no  way  to  fully  hedge  risk.  However,  Fastow  refused  to  take  no  for  an  answer.  He 
continued  to  investigate  and  eventually  decided  that  he  had  discovered  a  way  to  lock 
in  gains  (although  he  chose  not  to  ask  the  Research  Group  for  confirmation)  by  using 
off-book partnerships. The accounting rules then in force allowed a
new company, which we will call X, to be considered independent even if 97% of the company
is  owned  by  company  A  (Enron)  so  long  as  the  last  3%  is  owned  by  an  independent 
company.  This  new  company,  company  X,  then  provides  insurance  to  company  A 
guaranteeing  that  company  A  would  never  lose  any  of  the  profits  recognized.  This 
mechanism  can  also  be  used  to  allow  company  X  to  purchase  assets  from  company  A 
for  more  than  they  are  worth.  While  in  reality  nothing  has  changed,  on  the  Profit  and 
Loss  statement  provided  to  the  shareholders  of  company  A,  everything  looks  excellent. 

One  thing  that  made  this  practice  worse  was  that  although  company  X  was 
supposed  to  be  independent  of  company  A,  in  Enron’s  case  it  wasn’t.  The  3%  that 
was  supposed  to  be  independent  was  owned  either  by  Andy  Fastow  or  another  Enron 
employee, Michael Kopper. They managed the companies (known by various names
such as LJM and Raptor) and ensured their stakes were recovered through management
fees paid by Enron (along with tens of millions of dollars more unbeknownst
to  Lay  and  Skilling).  In  the  end,  Enron  was  the  only  one  with  a  financial  interest 
in  these  companies.  The  final  card  in  the  house  of  cards  was  how  all  of  this  was 
financed. Lay, Skilling, Fastow, Kopper, and everyone else believed that Enron stock
would continue to go up. They therefore used this “fact” to guarantee the profits
by selling Enron stock to company X for capital. If Enron stock ever went below a
certain level, company X would call on Enron to provide money so that it could
pay off on its insurance policy to Enron. However, this would happen at the
very time that Enron could not pay company X. This is in fact what happened in
late 2001.

Other things contributed to Enron’s downfall, including the negative publicity
surrounding the California energy crisis of 2000 and 2001. It appears that Enron
manipulated  demand  during  this  crisis  to  make  huge  profits.  They  received  a  lot  of 
bad  publicity  which  may  have  contributed  to  Wall  Street  analysts  finally  beginning 
to  dig  into  Enron’s  financials.  Also,  the  house  of  cards  based  on  hiding  losses  began 
to  crumble  as  the  next  “big  idea”  (in  this  case  Enron  Broadband  and  deregulation 
of electricity sales) didn’t materialize. Shortly before this, in August of 2001,
Sherron Watkins, an employee of Fastow’s, wrote a letter to Ken Lay to explain in
detail what Fastow was doing. While this letter was not discovered by Lay until too
late (and did not leave Enron until long after Enron declared bankruptcy), this is the
best example of an insider within Enron [?,?].

2.6 Summary

In the post-Cold War era, continued cases of state and military espionage have been
joined  by  economic  espionage  as  a  matter  of  national  security.  Despite  the  widespread 
prevalence  of  the  Internet,  and  the  public  attention  on  external  individuals  hacking 
into  computer  systems,  survey  after  survey  finds  that  the  greatest  threats  come  from 
within  organizations. 

For  information  to  be  useful,  it  must  be  accessible  by  the  right  people  at  the 
right  time  and  place.  Insiders  take  advantage  of  this  by  using  information  they  may 
have  a  legitimate  right  to  for  illegitimate  reasons.  Furthermore,  the  Information  Age 
with  its  increased  portability,  connectivity,  and  mobility  has  significantly  increased 
the effectiveness of the insider [?]. In the overwhelming majority of cases, American
citizens convicted of espionage did not apply for the job with the intention of
committing espionage. Events that occurred after they were employed in a sensitive
position precipitated their actions.

While initial personnel reviews uncover factors that may increase the possibility
of problems in the future, it is the ongoing observations that are critical to
preventing not only the commission of acts of sabotage and theft but the creation
of insiders. First-rate management can discover subordinates in crisis before these
personal  crises  escalate  into  state  crises.  By  recognizing  warning  signs,  showing  their 
genuine  concern  and  acting  on  it  to  provide  assistance  programs  to  people  in  crisis, 
management  has  the  opportunity  to  greatly  diminish  the  insider  threat.  And  in  cases 
where  people  don’t  respond  to  these  assistance  programs,  they  can  be  removed  from 
positions  where  they  can  cause  harm. 

Unfortunately,  with  the  increased  focus  on  lean  management,  there  is  less  time 
for  managers  to  get  to  know  their  people.  Instead,  there  is  a  greater  need  to  focus  on 
automated  techniques  to  highlight  those  areas  where  greater  attention  may  be  needed. 
In today’s organizations, email is one of the best indicators of a person’s interests and
the communities he is a member of. By analyzing email, an interest profile is created
and, if it matches established risk profiles, an alert is generated.

In order to generate these profiles, the email activity must first be categorized.
The type of clustering that has shown the greatest promise with text is probabilistic
clustering, which incorporates the existence of underlying topics. PLSI and Author
Topic are two models based on probabilistic clustering. By applying these models to
email activity, reasonable clusters are developed and from them profiles are generated.
Once these profiles have been created, social networks are then used to illustrate
common interests and to predict the existence of unseen relationships, as discussed in
the following chapters.

Privacy concerns generally prevent the use of real-world datasets. However, with the
public disclosure of Enron’s email, researchers have gained a unique source of email
data for analysis. Specifically, for this research, it is hoped that whistleblower Sherron
Watkins will emerge as a potential insider.

III. Methodology

With  87  percent  of  all  theft  attributed  to  insiders,  the  need  to  identify  potential  insider 
thieves  before  they  steal  is  critical.  Furthermore,  given  that  the  current  business 
model  puts  more  emphasis  on  working  managers,  there  is  a  definite  need  for  developing 
automated  tools  to  at  least  give  managers  a  direction  of  where  to  focus  their  time. 
What  is  needed  is  to  determine  which  techniques  are  most  effective  at  detecting 
potential  insider  threats.  Two  techniques  that  have  shown  promise  at  extracting 
information  from  documents  are  Probabilistic  Latent  Semantic  Indexing  (PLSI)  and 
Author Topic. By applying them to the Enron email corpus and determining their
usefulness at extracting potential insider threats, it is possible to determine how
effective they are. They will be successful if they can (1) generate clear topics and
(2) find individuals who have clandestine interests in those topics. First, individuals
who emerge as hiding their interests in sensitive topics show the potential to be insider
threats. Second, individuals who tend not to share their interests with people within the
organization may feel alienated and have a reduced sense of loyalty to the company.
These  individuals  also  have  a  greater  potential  to  become  insider  threats. 

This chapter begins by covering the specific research question being tested, followed
by the evaluation metrics used to determine whether or not the question has
been answered affirmatively. Next, the processing of the data is discussed, including
the data itself as well as its preparation, clustering, and analysis. The chapter
concludes with two additional experiments that are run to see if any additional
information can be uncovered.

3.1  Research  Questions 

The original goal of this thesis was to test “Are probabilistic clustering and social
networking techniques applied to email and internet activity effective at detecting
potential insider threats?” While this is the question with which the thesis began, the
scope of the thesis has been trimmed. The final research question is, “Are probabilistic
clustering and social networking techniques applied to email effective at detecting
potential insider threats?” The reason for the scope change is discussed in the final
chapter.

3.2  Evaluation  Metrics 

In order to consider these probabilistic clustering and social networking techniques
useful, they must be valid, useable, and timely [?]. For the techniques
to be valid, they must reveal either potential or actual insiders. This metric is a
difficult one to test because there are few real-world email datasets where insiders are
known.

In  the  case  of  Enron,  the  only  known  potential  insider  is  Sherron  Watkins. 
Therefore,  the  first  validity  metric  is  whether  or  not  the  techniques  reveal  Sherron 
Watkins  as  an  insider.  This  is  accomplished  in  two  steps.  First,  the  topic  which 
is  most  connected  to  the  off-book  partnerships  is  established.  Next,  the  individuals 
with  a  clandestine  interest  in  that  topic  are  determined.  For  the  technique  to  be  valid, 
Sherron  Watkins  must  emerge  as  being  one  of  those  individuals.  This  same  process  is 
then  repeated  for  the  topic  which  is  most  connected  to  socializing.  It  would  be  ideal  if 
it  were  possible  to  establish  that  other  Enron  insiders  shared  similar  interests  with  her. 
Unfortunately,  since  there  are  no  other  known  insiders,  this  is  not  possible.  Instead,  to 
test  whether  or  not  these  techniques  connect  people  who  should  have  similar  interests, 
two  additional  tests  are  performed.  The  first  test  checks  if  Kenneth  Lay  (chairman 
of  Enron),  Jeffrey  Skilling  (Enron’s  CEO),  and  Andrew  Fastow  (Enron’s  CFO  and  a 
manager  of  the  off-the-book  partnerships)  share  common  interests.  The  second  test 
checks if “LJM” and “raptor” emerge in a category that appears high for Andy Fastow
and/or Michael Kopper (a manager of the off-the-book partnerships).

To test useability, several things must occur. First, the categories that emerge
must be understandable, i.e., by looking at the most significant words that make up a
topic, a general sense of the topic must emerge. Second, the documents that the
techniques consider most representative of a topic must actually be representative of
that topic. Similarly, the individuals that the techniques consider most associated
with a topic must also actually be representative of that topic. While measuring each
of these useability metrics is much more subjective than measuring the two validity
metrics, it is possible to make a qualitative analysis of whether or not these techniques
produce results that “make sense”. In addition to these three metrics, a manageable
number of people with clandestine interests must also emerge. If no individuals
emerge, then either the technique is running on data from the perfect organization
or potential insiders are escaping detection. On the other hand, if ten percent of
the employees emerge as potential insiders, the huge number of false positives also
makes the technique unusable. Therefore, a final useability metric measures success
if the percentage of potential insiders extracted is between 0.1% and 1.0% of
the population.

Finally, getting valid information is not especially helpful if it takes months
to emerge. At the same time, since potential insiders emerge over time, producing
results  in  hours  is  not  necessary.  The  techniques  are  timely  if  they  complete  in  a  time 
measured  in  days.  Specifically,  if  the  techniques  complete  within  7  to  10  days,  they 
are  considered  useful. 

3.3 Data Processing

Now  that  the  research  question  and  the  metrics  have  been  discussed,  the  next 
subject  to  review  is  the  data  that  the  techniques  will  be  tested  on.  In  2003,  as  part  of 
the  investigation  into  Enron,  the  Federal  Energy  Regulatory  Commission  made  public 
Enron’s  senior  management’s  email  activity  over  a  period  of  nine  months.  In  addition 
to  its  value  in  the  prosecution  of  the  case  against  Enron’s  senior  management,  this 
data  has  become  a  touchstone  of  research  into  email  datamining  techniques.  The  data 
was originally stored as scanned images as well as .pdf files. The data was purchased
by Leslie Kaelbling at MIT who, along with several people at SRI, cleaned up several
data integrity issues. After this cleanup, William Cohen at Carnegie Mellon University
received the data and organized it into a 400 MB tar file consisting of 500,000 messages
from the folders of 150 Enron employees. Andres Corrada-Emmanuel at the University
of Massachusetts performed a hash test on this data and determined that the data
actually contained two copies of every message. The newly cleaned data was then
retrieved by Jitesh Shetty at UC Berkeley who organized the new, reduced collection
of approximately 250,000 messages into a MySQL database. This is the database
used  for  this  thesis. 
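
A duplicate check of the kind described above can be sketched by hashing each message
and keeping only the first message seen for each digest. The choice of SHA-256 and the
hashing of the raw body text are assumptions; the source does not say which hash or
which fields were used, and the messages below are invented.

```python
import hashlib


def deduplicate(messages):
    """Keep the first message for each distinct content digest."""
    seen = set()
    unique = []
    for text in messages:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(text)
    return unique


# Hypothetical corpus in which every message appears twice.
messages = [
    "Meeting at 3pm",
    "Q3 forecast attached",
    "Meeting at 3pm",
    "Q3 forecast attached",
]
unique = deduplicate(messages)
```

Here the four stored messages reduce to two, mirroring the halving of the corpus from
roughly 500,000 to roughly 250,000 messages.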

3.3.1 Database. The database consists of the email messages and some files
for categorization. Since the categorization files assume a fixed number of categories
and are only populated with a small amount of test data, they are excluded. The
only files used are the ones directly related to the email messages. This includes a
Messages file made up of a unique message id (MessageID), the date (MessageDt) and
time sent (MessageTz), the subject (Subject) and the id of the sender (SenderID); a
headers file consisting of multiple records for each header field of the message; and a
bodies file consisting of a single record containing the text of the email.

In addition, each sender and recipient is assigned a unique PersonID which
is stored in the People file. This PersonID along with the Recipients file allows
accessing the messages from either the recipient or the sender (see Figure ??). Finally,
a mailgraph file summarizes the number of correspondences between senders and
recipients.
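
The relationship between people and messages can be illustrated with a small
in-memory database. The sketch below uses Python's `sqlite3` module rather than MySQL,
and the column layouts of People and Recipients (and the `Email` column) are
assumptions, since the text names only MessageID, Subject, SenderID, and PersonID. It
shows how one PersonID reaches messages either as sender or as recipient.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed, simplified schema; the real database has more columns.
    CREATE TABLE People     (PersonID INTEGER PRIMARY KEY, Email TEXT);
    CREATE TABLE Messages   (MessageID INTEGER PRIMARY KEY, Subject TEXT, SenderID INTEGER);
    CREATE TABLE Recipients (MessageID INTEGER, PersonID INTEGER);
    INSERT INTO People VALUES (1, 'a@example.com'), (2, 'b@example.com'), (3, 'c@example.com');
    INSERT INTO Messages VALUES (10, 'budget', 1), (11, 'lunch', 2);
    INSERT INTO Recipients VALUES (10, 2), (10, 3), (11, 1);
""")


def messages_for(person_id):
    """Return (MessageID, role) pairs for one person, as sender or recipient."""
    rows = conn.execute(
        """
        SELECT MessageID, 'sender' AS role FROM Messages WHERE SenderID = ?
        UNION ALL
        SELECT MessageID, 'recipient' AS role FROM Recipients WHERE PersonID = ?
        ORDER BY MessageID
        """,
        (person_id, person_id),
    )
    return rows.fetchall()
```

For example, `messages_for(1)` finds message 10 through the Messages file (person 1
sent it) and message 11 through the Recipients file (person 1 received it).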

While the headers file contains valuable SMTP-type information, the only information
used for the clustering algorithms is contained in the People, Recipients, and
Messages tables. Furthermore, since this information is in a more concise form with
only one record per message, the information is processed much faster by excluding
the headers file.

In addition to these files, there are several more files needed to perform the
clustering. In short, files are needed to capture the relationship between words and
documents. The minimum files needed to perform this analysis concisely are a words
file (Dictionary or Words) that creates a unique identifier for each word and tracks the
number of times the word appears in the corpus, and a second file (MessageDicts or
MessageWords) that tracks how many times each word appears in each document in order
to determine the relative frequency of words in a document. Observe that words either
are created while processing the data files (a collection of letters separated by
non-letters constitutes a word) or exist a priori in a dictionary, in which case words
in the documents are validated against this list before being added to the database.

Figure 3.1: Relating People to Messages in the Enron Corpus

3.3.2 Preparation. The process of populating the supplemental files is fairly
straightforward. At an abstract level, all that is required is to look at the body of each
message, count the number of times each word appears, and load this information in
the MessageDicts (MessageWords) file. In addition, by the time the entire corpus is
read, the number of times each word appears is stored in the Dictionary (Words) file.

What  makes  this  process  a  little  more  difficult  is  determining  what  constitutes 
a  word.  In  previous  work,  words  were  defined  strictly  as  collections  of  letters.  This 
means  that  phone  numbers  and  dates  are  not  stored.  In  addition,  since  email  addresses 
contain non-alphabetic characters, they also are not considered words. With
a looser definition, these objects could certainly be considered words, and
they would serve as excellent discriminators for clustering like messages. Nevertheless,
this thesis follows previous research by restricting words to
containing only letters. Each time a non-alphabetic character occurs, the current
word  is  considered  complete.  This  approach  was  selected  to  decrease  the  number 
of  words  in  the  corpus  so  that  (1)  the  algorithms  would  perform  faster  and  (2)  the 
smaller  word  space  would  result  in  more  general  clusters.  Furthermore,  a  dictionary 
[?]  was  retrieved  from  the  DEC  Systems  Research  Center  containing  over  104,000 
words.  The  proposed  words  from  the  corpus  are  checked  against  the  dictionary  before 
placement  in  the  database. 
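
The word-extraction and counting steps can be sketched as below. The sketch is in
Python rather than the C used for the thesis code, the three-word `dictionary` stands
in for the DEC word list, lower-casing is an assumption, and the names `tokenize` and
`count_words` are illustrative. Here `message_words` plays the role of MessageDicts
(MessageWords) and `corpus_counts` the role of Dictionary (Words).

```python
import re
from collections import Counter


def tokenize(text):
    """Split on every non-alphabetic character; keep runs of ASCII letters."""
    return [w.lower() for w in re.findall(r"[A-Za-z]+", text)]


def count_words(bodies, dictionary):
    """Return per-message counts and corpus-wide counts of dictionary words."""
    message_words = {}
    corpus_counts = Counter()
    for message_id, body in bodies.items():
        # Candidate words are kept only if they appear in the dictionary.
        counts = Counter(w for w in tokenize(body) if w in dictionary)
        message_words[message_id] = counts
        corpus_counts.update(counts)
    return message_words, corpus_counts


# Hypothetical dictionary and message bodies.
dictionary = {"call", "gas", "contract"}
bodies = {
    1: "Call 555-1234 re: gas contract.",
    2: "Gas contract signed 10/12; email jdoe@example.com",
}
message_words, corpus_counts = count_words(bodies, dictionary)
```

Under this definition the phone number and date vanish entirely, and the email address
breaks into the fragments "jdoe", "example", and "com", which the dictionary check
then discards.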

A  second  technique  used  to  decrease  the  size  of  the  vocabulary  is  stemming. 
Stemming removes suffixes from words. For example, the words bake, bakes, baker,
baking, and baked all refer to the same thing. Rather than having five distinct words
in  the  vocabulary  (and  consequently  more  problems  clustering),  the  stem  “bak”  is 
used  for  all  of  them.  While  this  does  tend  to  make  some  words  more  difficult  to 
understand,  it  is  expected  that  the  most  prominent  words  in  the  clusters  are  complex 
enough  that  stemming  doesn’t  prevent  understanding  their  meaning.  The  code  to 
perform  the  stemming  was  written  in  C  by  Martin  Porter  [?]. 

Finally, the last technique used to improve the algorithms’ performance is the
removal of the most common words in the English language. Words like “the” and
“of” (these two words alone account for 9% of all written words) are not helpful
in developing clusters. In addition, it has been shown [?, ?] that languages follow
Zipf’s Law: the frequency distribution of words can be described using an inverse
power-law curve. There are a few (approximately 500) common words that appear
with very high probability. There are more average words (approximately 5,000) that
appear with a reasonable frequency. Finally, most words (approximately 50,000) in a
language are rare and appear with a very small frequency (Figure ??). By removing
the most common words, the most all-inclusive clusters are removed and more realistic
clusters emerge. The list of stop words was retrieved from the University of Neuchatel
[?].

Figure 3.2: Words follow an inverse power law distribution [?]


While  it  is  not  the  purpose  of  this  research  to  perform  clustering  by  using  tf-idf 
(term  frequency-inverse  document  frequency)  [?],  it  is  possible  to  use  the  information 
extracted  to  perform  this  analysis  without  any  additional  effort  simply  by  assuming 
multiple  copies  of  each  document  corresponding  to  the  number  of  people  associated 
with  it. 

3.3.3  Clustering.  For  PLSI-U,  the  code  was  written  in  C  from  scratch  using 
the description provided by Hofmann for PLSI [?] and the expectation maximization
algorithm  described  by  Moon  [?]  along  with  the  extensions  described  in  Section  ?? 
to  include  users.  The  input  parameters  include  the  number  of  categories,  the  value 
of  the  mean  square  error  that  constitutes  convergence,  and  the  maximum  number  of 
iterations.  The  number  of  categories  is  set  to  48  for  software  and  hardware  reasons. 
The maximum number of iterations is initially set to 80. There is no a priori reason to
assume that 80 iterations are sufficient; however, in practice the data
consistently converged before 80 iterations. As a result, 80 is selected as a sufficient
number  of  iterations.  One  problem  with  PLSI-U  is  its  long  run-time.  For  instance, 
the Enron corpus has 254,904 documents, 87,395 users, and, using a stemmed dictionary,
54,147 words. Combining this with 48 categories, and noting that 4 calculations
are performed, means each iteration includes approximately 2 × 10^17 calculations
(254,904 × 87,395 × 54,147 × 48 × 4 ≈ 2.3 × 10^17). In
addition, since every conditional probability (p(u|z), p(w|z), p(d|z), p(z|d,w,u), p(z))
needs to be stored, the memory requirements are similarly staggering. Several
techniques are used to decrease both the memory and processing requirements. First, by
keeping track of the old and new conditional probabilities, it is possible to avoid storing
p(z|d,u,w) entirely. The resulting memory savings more than offset the additional
processing  requirements.  In  addition,  by  using  the  Enron  documents  to  determine 
what  words  and  users  need  updating  (instead  of  updating  conditional  probabilities 
for  which  there  is  no  data),  the  processing  requirements  are  also  significantly  reduced. 
Finally,  by  arranging  the  categories  as  the  outermost  loop,  it  is  possible  to  parallelize 
the  program  and  process  each  category  for  a  given  iteration  in  parallel,  updating 
the  values  between  parallel  processes  before  proceeding  to  the  next  iteration  (needed 
for  the  denominators  which  sum  across  categories).  The  need  for  parallel  processing 
drives  the  determination  of  48  categories.  The  tests  are  run  on  a  server  farm  where 
only  a  maximum  of  sixteen  processors  are  available  to  a  single  user.  Furthermore,  due 
to the memory allocation on the shared servers, running more than three processes
on  any  server  causes  the  application  to  crash.  As  a  result,  16  x  3  =  48  categories  are 
chosen. 
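The following is a minimal sketch of one way the PLSI-U expectation maximization loop could look. It is written in Python for brevity and is not the C implementation described above. It assumes the factorization p(d,u,w) = Σ_z p(z) p(d|z) p(u|z) p(w|z), flattens each message into hypothetical (document, user, word, count) tuples, and uses the mean square change in p(z) as the convergence test; all of these are assumptions made for illustration. It does mirror two of the savings described: only observed tuples are visited, and each posterior p(z|d,u,w) is used immediately and never stored.

```python
import random
from collections import defaultdict


def plsi_u_em(triples, n_topics, max_iters=80, tol=1e-8, seed=0):
    """Fit p(z), p(d|z), p(u|z), p(w|z) to a non-empty list of
    (doc, user, word, count) observations."""
    rng = random.Random(seed)
    docs = sorted({t[0] for t in triples})
    users = sorted({t[1] for t in triples})
    words = sorted({t[2] for t in triples})

    def random_dist(keys):
        # Strictly positive random start, normalized to sum to 1.
        raw = {k: rng.random() + 0.1 for k in keys}
        total = sum(raw.values())
        return {k: v / total for k, v in raw.items()}

    p_z = [1.0 / n_topics] * n_topics
    p_d = [random_dist(docs) for _ in range(n_topics)]
    p_u = [random_dist(users) for _ in range(n_topics)]
    p_w = [random_dist(words) for _ in range(n_topics)]

    for _ in range(max_iters):
        acc_z = [0.0] * n_topics
        acc_d = [defaultdict(float) for _ in range(n_topics)]
        acc_u = [defaultdict(float) for _ in range(n_topics)]
        acc_w = [defaultdict(float) for _ in range(n_topics)]

        # E-step folded into the accumulation: visit observed tuples only.
        for d, u, w, n in triples:
            joint = [p_z[z] * p_d[z][d] * p_u[z][u] * p_w[z][w]
                     for z in range(n_topics)]
            total = sum(joint)
            if total == 0.0:
                continue
            for z in range(n_topics):
                r = n * joint[z] / total  # count-weighted p(z|d,u,w)
                acc_z[z] += r
                acc_d[z][d] += r
                acc_u[z][u] += r
                acc_w[z][w] += r

        # M-step: renormalize the accumulated expected counts.
        grand = sum(acc_z)
        new_p_z = [v / grand for v in acc_z]
        mse = sum((a - b) ** 2 for a, b in zip(new_p_z, p_z)) / n_topics
        p_z = new_p_z
        for z in range(n_topics):
            if acc_z[z] > 0.0:
                p_d[z] = {d: acc_d[z][d] / acc_z[z] for d in docs}
                p_u[z] = {u: acc_u[z][u] / acc_z[z] for u in users}
                p_w[z] = {w: acc_w[z][w] / acc_z[z] for w in words}
        if mse < tol:
            break
    return p_z, p_d, p_u, p_w


observations = [
    ("d1", "alice", "gas", 3), ("d1", "bob", "gas", 3),
    ("d2", "alice", "price", 2), ("d3", "carol", "football", 4),
    ("d3", "dave", "football", 4),
]
p_z, p_d, p_u, p_w = plsi_u_em(observations, n_topics=2)
```

Because the categories appear only as the inner index of each accumulator, the loop over z could be moved outermost and split across processes, which is the arrangement the text describes.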

While the Latent Dirichlet Allocation (LDA) code is available for public use
from [?], the Author Topic (AT) extensions to it are not. Furthermore, since Rosen-Zvi
et al. suggest using Gibbs sampling while Blei’s LDA code uses variational
inference, the C code for AT was written from scratch. The input parameters
include the number of topics and the number of iterations needed for convergence.
The number of topics is chosen as 48 to match the PLSI-U tests. The number of
iterations selected is 2000 based on Rosen-Zvi’s results [?]. The output consists of
two files. One file includes the number of times each word is assigned to each topic
and the other file includes the number of times each user is paired with a particular
topic. These two pieces of data are sufficient to calculate the desired probabilities.
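As a sketch of how the two count files could be turned into probabilities, the usual approach for Gibbs-sampled topic models is to add the Dirichlet hyperparameter to each count and normalize. The snippet below assumes that approach; the function name, the toy counts, and the hyperparameter values (0.01 and 0.5) are illustrative assumptions, not values taken from this research.

```python
def smoothed_rows(counts, prior):
    """Add a symmetric prior to every count and normalize each row to sum to 1."""
    result = []
    for row in counts:
        total = sum(row) + prior * len(row)
        result.append([(c + prior) / total for c in row])
    return result


# topic_word_counts[z][w]: times word w was assigned to topic z (toy values)
topic_word_counts = [[8, 1, 1], [0, 5, 5]]
# user_topic_counts[u][z]: times user u was paired with topic z (toy values)
user_topic_counts = [[9, 1], [2, 8]]

p_w_given_z = smoothed_rows(topic_word_counts, prior=0.01)  # prior plays the role of beta
p_z_given_u = smoothed_rows(user_topic_counts, prior=0.5)   # prior plays the role of alpha
```

With these toy counts, the first user's probability for topic 0 is (9 + 0.5) / (10 + 2 × 0.5) = 9.5 / 11.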

In order to perform the clustering, the data is extracted from the Enron database
and placed into a text file (Figure ?? A). The text file has one line for each message.
Each line begins with the number of words in the email message, followed by
the number of people connected to the message. The line concludes with a list
of the identification numbers of those individuals followed by a list of wordId:frequency
pairs. From this data, the clustering programs are able to extract clusters and output
the probability that (1) a user is in a particular category, (2) a document is in a
particular category, (3) a word occurs in a particular category, and (4) a category
occurs. They do this iteratively; AT creates a file every 100 iterations, while PLSI-U
creates a file every iteration, resulting in 20 and 80 files respectively (Figure ?? B).
The values in the final iteration's file are then loaded back into the database for later
analysis (Figure ?? C).
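A line of the input text file described above could be parsed as in the following sketch. The whitespace delimiter, the function name, and the example line are assumptions made for illustration; only the field order comes from the description.

```python
def parse_message_line(line):
    """Parse '<numWords> <numPeople> <userId>... <wordId:frequency>...'."""
    fields = line.split()
    n_words = int(fields[0])
    n_people = int(fields[1])
    user_ids = [int(x) for x in fields[2:2 + n_people]]
    word_freqs = {}
    for pair in fields[2 + n_people:]:
        word_id, freq = pair.split(":")
        word_freqs[int(word_id)] = int(freq)
    return {"n_words": n_words, "user_ids": user_ids, "word_freqs": word_freqs}


# Hypothetical message: 3 words, 2 people (ids 17 and 42), three wordId:frequency pairs.
example = parse_message_line("3 2 17 42 5:2 9:1 120:4")
```

The sketch stores the leading word count as given rather than checking it against the pairs, since the description does not say whether it counts distinct words or total occurrences.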

The final step is to normalize these probabilities (Figure ?? D). While the
conditional probabilities (e.g. p(d|z)) were normalized within a category during the
PLSI-U iterations, they now are normalized for a given document. In other words,
the probabilities that email message 1 is topic 1, topic 2, ..., topic 48 must sum to 1
(e.g. p(z|d) must be normalized). By Bayes’ Rule:

\[ p(z \mid u) = \frac{p(u \mid z)\, p(z)}{p(u)} \tag{3.1} \]

Since p(u|z) and p(z) are readily available and p(u) is considered uniform (i.e. the
probability of any user being selected is 1 / (Number of Users)), this calculation is
easily performed. A similar calculation is performed to arrive at p(z|w) and p(z|d).
Once these calculations are accomplished, it is now possible to observe the social
networks that result.
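The normalization can be sketched as below. This is a minimal illustration with hypothetical names and toy values, not the code used in this research. One deliberate difference is labeled here: the denominator p(u) is computed as the sum of p(u|z)p(z) over all categories, so that each user's probabilities sum to 1 by construction, rather than being fixed at the uniform value 1 / (Number of Users).

```python
def posterior_over_topics(p_x_given_z, p_z):
    """Turn p(x|z) (indexed [z][x]) and p(z) into p(z|x) (indexed [x][z])."""
    n_topics = len(p_z)
    n_items = len(p_x_given_z[0])
    result = []
    for x in range(n_items):
        joint = [p_x_given_z[z][x] * p_z[z] for z in range(n_topics)]
        evidence = sum(joint)  # p(x), assumed non-zero for every item
        result.append([j / evidence for j in joint])
    return result


# Toy model: 2 categories, 3 users.
p_u_given_z = [[0.7, 0.2, 0.1],
               [0.1, 0.3, 0.6]]
p_z = [0.4, 0.6]
p_z_given_u = posterior_over_topics(p_u_given_z, p_z)
```

For the first user, the joint values are 0.28 and 0.06, so p(z = 0 | u) = 0.28 / 0.34. The same function applies unchanged to p(w|z) and p(d|z).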
3.3.4 Analysis. After data clustering, building the social networks is straightforward.
First, an implicit network is constructed from the PLSI-U data. Recall, the
implicit network is composed of individuals that share an interest in the same topic.
If two people both have an interest in a topic that exceeds a threshold, T, a link is
created between those two people (if p(z = Z_1 | u = U_1) > T and p(z = Z_1 | u = U_2) > T,
then the link U_1U_2 is created for the implicit PLSI network for category Z_1). This
process is repeated for every pair of people for every topic, creating separate graphs
for each topic. Once the PLSI-U implicit network is formed, the same process using
the AT data is used to create an AT implicit network (Figure ?? E).
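The implicit network rule can be sketched as follows. The data layout (a dictionary of per-user probability lists and one threshold per topic) and all names are assumptions made for illustration.

```python
from itertools import combinations


def implicit_network(p_z_given_u, thresholds):
    """Link every pair of users whose p(z|u) exceeds the topic's threshold.

    p_z_given_u: {user: [p(z|u) for each topic]}
    thresholds:  [T for each topic]
    Returns {topic index: set of frozenset({user_a, user_b}) edges}.
    """
    networks = {}
    for z, threshold in enumerate(thresholds):
        interested = sorted(u for u, probs in p_z_given_u.items()
                            if probs[z] > threshold)
        networks[z] = {frozenset(pair) for pair in combinations(interested, 2)}
    return networks


p_z_given_u = {"john": [0.6, 0.4], "mary": [0.7, 0.3], "steve": [0.1, 0.9]}
implicit = implicit_network(p_z_given_u, thresholds=[0.5, 0.5])
```

In this toy example John and Mary are linked for topic 0, while topic 1 has no edges because Steve is the only interested user.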

Once the implicit networks are formed, a final process creates an explicit network
based on email data. The explicit network is composed of edges between individuals
who have emailed one another. If there is at least one email message for a specific
topic between two people, a link is created between them. Mathematically, if
p(z = Z_1 | d = D_1) > T, then for all U_1, U_2 ∈ D_1 the link U_1U_2 is created for the
explicit network for category Z_1. This process is repeated for every topic and every pair of
individuals.
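A sketch of the explicit network rule follows, under the assumption that each message is available as a list of the users connected to it and that p(z|d) is stored per document. The names and toy data are hypothetical.

```python
from itertools import combinations


def explicit_network(messages, p_z_given_d, thresholds):
    """Link all users on a message for every topic the message exceeds.

    messages:    {doc_id: [users connected to the message]}
    p_z_given_d: {doc_id: [p(z|d) for each topic]}
    thresholds:  [T for each topic]
    """
    networks = {z: set() for z in range(len(thresholds))}
    for doc_id, users in messages.items():
        for z, threshold in enumerate(thresholds):
            if p_z_given_d[doc_id][z] > threshold:
                for pair in combinations(sorted(set(users)), 2):
                    networks[z].add(frozenset(pair))
    return networks


messages = {1: ["mary", "steve"], 2: ["john", "mary"]}
p_z_given_d = {1: [0.9, 0.1], 2: [0.2, 0.8]}
explicit = explicit_network(messages, p_z_given_d, thresholds=[0.5, 0.5])
```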

What is unclear is how to determine if an email contains the relevant topic.
There are several possible ways to do this. One is to set a probability threshold. For
example, any email or user where the probability of the relevant category given that
email or user is above 5% is considered to contain an interest in that topic. However,
this gives more weight to popular topics. For instance, the Author Topic algorithm
results in one category (Category 0) having a probability for most users of over 90%.
Using a straight percentage cutoff, most individuals would only have an interest in
that one topic. A second option is to just take the top three categories for all users,
regardless of percentages. However, it is easy to construct examples where this also
is inappropriate. What seems most appropriate is to take the average probability of
a category and set the threshold at a certain number of standard deviations above it.
For instance, if, on average, individuals have a 0.5% probability of being interested in
category 43 with a standard deviation of 0.7% and someone has a 1.7% probability of
picking that topic, then (assuming a Gaussian distribution) that individual is about 1.7
standard deviations above the mean, higher than roughly 95% of the population, and is
considered interested in that topic. For this experiment, 1.64 standard
deviations above the mean (the one-sided 95% point of a Gaussian distribution) is used
to determine interest in a topic for both the explicit email network and the implicit interest
network.
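The threshold rule and the category 43 example can be checked with a short sketch. The function names are hypothetical, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption.

```python
from statistics import mean, pstdev


def threshold_from_stats(mu, sigma, n_sigma=1.64):
    """Threshold set n_sigma standard deviations above the mean."""
    return mu + n_sigma * sigma


def interest_threshold(probabilities, n_sigma=1.64):
    """Compute the threshold for one category from every user's probability."""
    return threshold_from_stats(mean(probabilities), pstdev(probabilities), n_sigma)


# Worked example from the text: mean 0.5%, standard deviation 0.7%, observed 1.7%.
category_43_threshold = threshold_from_stats(0.005, 0.007)
is_interested = 0.017 > category_43_threshold
```

The threshold in the worked example is 0.005 + 1.64 × 0.007 = 0.01648, so a probability of 1.7% clears it.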

The  final  step  is  to  examine  the  implicit  and  explicit  social  networks.  Each 
implicit  network  is  compared  to  the  explicit  network  in  turn.  If  a  person  has  an 
interest  in  a  topic  demonstrated  by  the  presence  of  links  between  that  person  and 
others  in  the  implicit  network  but  has  no  links  to  anyone  in  the  explicit  network  for 
that topic, an exception is generated. For instance, if John, Mary, Steve, and Diane
have  an  interest  in  dancing  but  John  has  neither  sent  nor  received  an  email  from  any 
of  them  about  dancing,  an  exception  is  generated  for  John  and  dancing  indicating  that 
John  has  a  hidden  interest  and  could  be  feeling  alienated.  This  process  is  repeated 
for  every  person  and  for  every  topic.  Figure  ??  shows  this  graphically. 
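The comparison of implicit and explicit networks can be sketched as follows, using the dancing example. The edge representation (sets of two-person frozensets per topic) and the function names are assumptions made for illustration.

```python
from itertools import combinations


def users_in(edges):
    """Collect every user that appears on at least one edge."""
    return {user for edge in edges for user in edge}


def clandestine_interests(implicit, explicit):
    """Return (user, topic) exceptions: users linked in the implicit network
    for a topic but absent from the explicit network for that topic."""
    exceptions = []
    for topic, edges in implicit.items():
        hidden = users_in(edges) - users_in(explicit.get(topic, set()))
        exceptions.extend((user, topic) for user in sorted(hidden))
    return exceptions


interested = ["John", "Mary", "Steve", "Diane"]
implicit = {"dancing": {frozenset(p) for p in combinations(interested, 2)}}
explicit = {"dancing": {frozenset(("Mary", "Steve")),
                        frozenset(("Steve", "Diane"))}}
exceptions = clandestine_interests(implicit, explicit)
```

Only John is reported, since Mary, Steve, and Diane each appear on at least one email about dancing.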

3.4 Additional Experiments

Two  additional  experiments  are  discussed  in  Chapter  ??.  The  first  concerns  the 
possibility  of  extracting  additional  topics.  As  observed  in  Section  ??,  due  to  hardware 
and  software  constraints,  the  number  of  topics  that  can  be  clustered  is  limited  to 
48. However, limiting the number of topics to 48 may lump several topics into one
meta-topic. Therefore, once the data clustering and analysis are performed, the
documents and individuals that appear most probable for a specific topic are extracted
and a second round of probabilistic clustering is performed on them
to achieve a finer level of granularity. The ability to quickly drill down and produce
a finer level of granularity would overcome the traditional criticism that a limited
number of topics cannot produce useful topics.

[Figure 3.3 panels: IMPLICIT INTEREST NETWORKS (Socializing, Basketball, Accounting);
EXPLICIT INTEREST NETWORKS (Socializing, Basketball, Accounting)]

Figure 3.3: An Example of Clandestine Interests (implicit network = external;
internal email explicit network = internal email only)

The second concerns the possibility of calculating traditional social network
analysis (SNA) metrics from the social networks that are created and using them to
aid in the generation of additional potential insider threats. Once the networks are
developed, there are several additional questions to answer:

• What are the values of various social network centrality indicators including
degree, closeness and betweenness? Are the individuals with high centrality
measurements for a given topic “appropriate” for that topic? Said differently,
how do centrality measurements compare with the conditional probability of
an individual given a topic? Traditional SNA centrality measurements would
provide an additional validation to researchers that individuals who show up
both via probabilistic clustering and SNA are worthy of additional attention.

• Can anything about people’s positions be inferred from their email activity? For
instance:

— Are there individuals who have more emails sent to and received from
people outside of the company than inside the company? This may
indicate sales people or purchasing agents. At Enron, this may also have
indicated people involved in investor relations or public relations.

— Are there some people who appear to have a lot of interests? This may
indicate administrative assistants.

If  it  is  found  that  certain  SNA  measurements  suggest  certain  job  responsibilities, 
then  finding  people  with  those  SNA  measurements  that  do  not  have  those  job 
responsibilities  may  provide  another  indicator  of  a  potential  insider  threat. 

•  Can  anything  be  inferred  by  comparing  the  individual  explicit  networks  for  each 
category  to  a  consolidated  explicit  network  composed  of  all  categories?  Said 
differently,  what  is  the  relationship  between  the  individual  explicit  networks 
and  a  network  that  links  people  if  an  email  passed  between  them  regardless  of 
topic? 

•  Finally,  is  it  possible  to  follow  a  topic  as  it  progresses  through  the  organization? 
Topics  small  enough  in  size  to  perform  this  analysis  may  include  Jeff  Skilling’s 
resignation  and  Sherron  Watkins’  letter  to  Ken  Lay.  Such  an  analysis  on  a  topic 
of  interest  may  provide  an  additional  means  of  finding  either  potential  or  actual 
collaborators. 
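As an illustration of the first question, degree centrality (the simplest of the three indicators named above) can be computed directly from a topic's edge list. The sketch below uses the standard normalization by n - 1; the function name and the toy edges are hypothetical, and only users that appear on at least one edge are counted.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Fraction of the other users in the network that each user is linked to."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    return {node: len(adjacent) / (n - 1) for node, adjacent in neighbors.items()}


# Toy explicit network for one topic: Steve exchanges email with everyone else.
edges = [("mary", "steve"), ("steve", "diane"), ("steve", "john")]
centrality = degree_centrality(edges)
```

These values could then be set beside the conditional probability of each individual given the topic to see whether the two rankings agree.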

3.5 Summary

In order to find potential insiders, it is first necessary to get to know them.
Ideally this is done personally. However, in today’s workplace, this may not be possible.
The next best thing is to do it electronically. A good way to get to know someone
is to learn about their interests, and in this new information age, email is one of the
best sources of electronic information. By using probabilistic clustering and social
networking techniques, specifically Author Topic and PLSI-U, to datamine email, a
person’s topics of interest emerge. There are several possible uses for these topics
of interest to generate insider threat leads. The first way is to extract as
potential insiders people who have an interest in a specific topic but are neither the recipient
nor the sender of any internal email that contains that interest. A second way is
by finding those people who have an interest in a topic that is itself an indicator of
potential insiders (e.g. money, overseas loyalties, substance abuse, etc.). Finally, a
third way, not explored in this work, is to track a person’s topics of interest over time
and watch for any radical changes.

To  test  the  effectiveness  of  Author  Topic  and  PLSI-U  for  detecting  potential 
insider  threats  through  email  traffic,  the  Enron  email  corpus  is  broken  down  into  a 
collection  of  word  frequencies  and  user  assignments.  These  datapoints  are  then  input 
into  Author  Topic  and  PLSI-U  to  extract  topics  of  interest.  These  topics  of  interest 
are then applied to the emails to find: 1) the words that describe the topics, 2)
the documents and people that are the most representative of the topics, and 3) the
topics that best describe each individual. Finally, these are reviewed for the potential
insiders. The results of these experiments are described in the next chapter.

Figure 3.4: Data Clustering Process Flow

IV. Results and Analysis

As  discussed  previously,  the  goal  of  this  research  is  to  determine  the  applicability  of 
probabilistic  clustering  and  social  networking  techniques  applied  to  email  for  detecting 
potential  insider  threats.  This  is  determined  by  measuring  the  following  metrics: 

1.  Useability 

(a) Do the algorithms create clusters of words which provide a clear definition
of the topic? An algorithm performs perfectly if all of the most probable
words describe a single topic. The greater the number of outliers (words
that do not describe the topic described by a majority of words), the poorer
the algorithm performs. If no majority of words describes a single topic,
the algorithm performs poorly.

(b) Do the algorithms create clusters of individuals which match the topic
definitions described by the words? The more individuals who have a role in
the topic, the better the algorithm performs. For instance, if all of
the most probable individuals in the California Crisis topic are in the
Government Relations group, the algorithm is considered to have performed
well.

(c)  Do  the  algorithms  create  clusters  of  documents  which  match  the  topic 
definitions  described  by  the  words?  In  the  same  manner  as  the  two  previous 
metrics,  if  the  documents  are  clearly  about  the  topic,  the  algorithm  is 
considered  to  have  performed  well. 

(d) Do the algorithms create a manageable number of individuals with
clandestine interests? An individual is considered to have a clandestine interest
if he has an interest in a topic but does not send any emails within the
company about that topic. The number of individuals is considered
manageable if it is between 0.1% and 1.0% of the population.

2. Timeliness - Do the algorithms find potential insider threats in a reasonable
period of time (7 - 10 days)?

3. Validity


(a)  Does  the  analysis  reveal  Sherron  Watkins,  the  famed  Enron  whistleblower, 
as  an  insider? 

(b) Does the analysis show that the techniques are successful at finding
collaborators of known insiders by examining common interests? The first test
checks whether Ken Lay (Enron’s chairman), Jeff Skilling (Enron’s CEO during
most of Enron’s questionable business activities), and Andy Fastow
(Enron’s CFO, responsible for the off-the-book partnerships that contributed
to Enron’s downfall) have common interests. The second test checks whether
Andy Fastow and Michael Kopper (the two principal managers of the
off-the-book partnerships) both have a strong interest in the category about
“Raptors” (the principal transactions of the off-the-book partnerships).

After the results of these metrics are discussed, the chapter concludes with the results
of the two additional experiments. First, the results of the Two-Tiered Approach
are discussed. Finally, the results of the traditional Social Network Analysis (SNA)
are reviewed.

The  experiment  is  broken  down  into  eight  sub-experiments.  The  first  split  is 
between  words  in  the  dictionary  and  all  alphabetic  “words”  whether  or  not  they  are 
in  the  dictionary.  The  second  split  is  between  all  words  and  only  stemmed  words. 
The final split divides these four datasets between Author Topic and PLSI-U. After
running the tests on the stemmed words, it is clear that stemming does not make the
identification of words more difficult. In addition, increasing the size of the corpus by
not requiring stemmed words increases the size of the joint probability distribution
and makes it more difficult and time-consuming for the algorithms to converge. As
a result, the experiment only checks Author Topic and PLSI-U against stemmed
dictionary words and all stemmed words.

The  first  two  experiments  are  run  against  the  stemmed  dictionary  word  dataset 
while  the  third  and  fourth  are  run  against  the  stemmed  word  (no  dictionary)  dataset. 


After the algorithms complete, the clusters are examined to see if they “make sense.”
One additional concern must be considered when evaluating topics based on the most
probable words. Despite having already eliminated stop words, there are other words
that end up appearing in most if not all categories. For instance, in the Enron corpus,
power is a very common word (as is Enron). To avoid this problem, when printing the
words that describe a category, only words that appear in at most five categories
are included.
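The filter on widely shared words can be sketched as follows. The sketch assumes that “appearing in a category” means appearing in that category's list of most probable words; that interpretation, the function name, and the toy lists are assumptions.

```python
from collections import Counter


def distinctive_words(top_words_by_category, max_categories=5):
    """Keep only words found in the top-word lists of at most max_categories categories."""
    spread = Counter()
    for words in top_words_by_category.values():
        spread.update(set(words))  # count each word once per category
    return {category: [w for w in words if spread[w] <= max_categories]
            for category, words in top_words_by_category.items()}


# Toy example with a limit of 2 categories so the effect is visible with 3 categories.
top_words = {0: ["power", "meeting"], 1: ["power", "governor"], 2: ["power", "password"]}
filtered = distinctive_words(top_words, max_categories=2)
```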

One issue that surfaces quickly with the Author Topic data is the existence of
a category for which most individuals and emails have an overwhelming probability.
When the category is investigated, it emerges as one containing words like meeting,
regarding, discuss, issue, and schedule, words that are common to almost any
business email (Figure ??). As a result, this category is removed from any subsequent
processing and the conditional probabilities for the remaining categories are
renormalized as if the category did not exist.
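Removing the dominant category and adjusting the remaining probabilities amounts to a renormalization, sketched below with hypothetical names and toy values.

```python
def drop_category(probabilities, category):
    """Remove one category and rescale the rest so they again sum to 1."""
    remaining = [p for z, p in enumerate(probabilities) if z != category]
    total = sum(remaining)
    if total == 0.0:
        return remaining
    return [p / total for p in remaining]


# A user with 90% of their probability mass in the generic business category 0.
adjusted = drop_category([0.90, 0.06, 0.04], category=0)
```

Here the remaining two categories become 0.6 and 0.4. Note that positions in the returned list shift down by one after the removed category.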

For brevity, only four topics are discussed per experiment (Figures ??, ??, ??
and ??). The first topic (Senior Mgmt) is about the senior Enron executives. It is
the category that clustered Ken Lay, Jeff Skilling, Greg Whalley, and Jeff McMahon.
The second topic (California Crisis) best describes the California energy crisis. The
third topic (Research) focuses on the research performed by Vince Kaminski and his
group. Finally, the fourth common topic (Info Technology) concerns IT systems and
help desks. In addition, Figure ?? shows several additional categories (Scheduling,
Sept 11 Attacks, Personal Email, Enron Crisis, Fantasy Football). Although the
descriptions of these topics are inferred, they are listed in the figures for ease of
understanding.

4.1 Useability Metrics

In order to be useful, probabilistic clustering and social networking techniques
must generate topics that are easily identifiable. For instance, when attempting to find
insiders, although it is some help to know that John and Mary share an interest in
topic 17, it is a much greater help if it is known that topic 17 is about the organization’s
chief competitor. Therefore, the first three useability metrics examine the topics that
Author Topic and PLSI-U create when clustering. First, this is checked by looking
at the most probable words; second, it is checked by the most probable individuals;
and third, it is checked by the most probable documents. The final useability metric
checks the number of individuals that emerge as having clandestine interests. Too
many makes the results unusable, while too few means that actual insiders may be
overlooked.
[heading missing]       Info Technology         California Crisis
                        CATEGORY 28             CATEGORY 43
Lay           1.4%      Access        2.3%      Electricity   1.8%
Company       1.3%      Service       2.1%      Commission    1.7%
Year          1.2%      User          2.1%      State         1.6%
Ken           1.1%      Information   1.6%      Utility       1.5%
Opportunity   1.1%      System        1.6%      Energy        1.2%
President     1.0%      Contact       1.6%      Public        1.1%
Chairman      1.0%      Manage        1.4%      Legislature   1.1%
                        Server        1.4%      Regulatory    1.0%
                        Thi           1.4%      Senate        0.9%

Figure 4.2: Author Topic Sample Categories using Stemmed, Dictionary Words

4.1.1 Describing Topics with Words. The first metric considered is the most
probable  words.  The  topics  by  most  probable  words  are  in  Figures  ??,  ??,  ??,  ??, 
and  ??.  Although  complete  words  are  shown,  they  are  extrapolated  from  the  word 
stems  actually  produced.  Despite  initial  concerns  that  stemming  might  make  some 
of  the  words  difficult  to  determine  (e.g.  trying  to  determine  the  original  word  family 
that  stemmed  to  ‘thi’),  the  stemmed  words  that  distinguish  categories  prove  easy  to 
identify. 
Senior Mgmt             California Crisis       Research                Info Technology
CATEGORY 11             CATEGORY 0              CATEGORY 44             CATEGORY 7
PRC           0.3%      Governor      0.3%      Vine          1.3%      Unify         0.1%
Video         0.2%      Calpin        0.3%      Kaminski      0.7%      SAP           0.4%
Weekly        0.2%      IEP           0.3%      Research      0.4%      Netco         0.3%
Ken           0.2%      Dasovich      0.3%      Model         0.3%      Sitara        0.3%
Dial          0.2%      Edison        0.3%      Shirley       0.2%      Script        0.3%
Kean          0.2%      Gov           0.2%      Rice          0.2%      Class         0.2%
Cindy         0.2%      IEPA          0.2%      Visit         0.2%      Setup         0.2%
VP            0.1%      Duke          0.2%      Crenshaw      0.2%      Path          0.2%
Passcode      0.1%      Mara          0.2%      University    0.2%      Regan         0.2%

Figure 4.3: PLSI-U Sample Categories using Stemmed Words without a Dictionary


Senior Mgmt             California Crisis       Research                Info Technology
CATEGORY 8              CATEGORY 18             CATEGORY 25             CATEGORY 10
Develop       0.7%      Jeff          1.7%      Vince         3.8%      Password      1.9%
Opportunity   0.6%      Dasovich      1.7%      Kaminski      3.2%      User          1.9%
Technology    0.5%      Steff         1.0%      Research      1.0%      Access        1.8%
Base          0.5%      Mara          0.8%      Shirley       0.8%      Account       1.5%
Recent        0.5%      Shapiro       0.7%      VKamin        0.7%      ID            0.9%
Success       0.5%      California    0.1%      Universe      0.6%      Login         0.8%
Generate      0.4%      James         0.7%      Crenshaw      0.6%      Center        0.7%
Experience    0.4%      CA            0.6%      Stinson       0.5%      Log           0.7%
Lead          0.4%      Gov           0.5%      Model         0.5%      Respond       0.7%

Figure 4.4: Author Topic Sample Categories using Stemmed Words without a Dictionary


While both Author Topic and PLSI-U produce categories that are understandable
when only stemmed dictionary words are used, Author Topic’s categories are much
easier to identify and do not share many words in common with PLSI-U (see Appendix
?? for the listings of 25 words per category per experiment). This does not persist
when stemmed words without a dictionary are the dataset. For the second two
experiments, the descriptiveness of PLSI-U is at least equal to Author Topic and possibly
more descriptive due to its more extensive use of acronyms. Furthermore, Author
Topic does not seem to change much in the descriptive words between dictionary
words and all words. Over half (29) of the categories are easily cross-referenced
between the stemmed dictionary words and stemmed words (no dictionary) experiments
for Author Topic.

Scheduling              Sept 11 Attacks         Personal Email          Enron Crisis            Fantasy Football
CATEGORY 0              CATEGORY 27             CATEGORY 22             CATEGORY 38             CATEGORY 41
Thi           1.3%      Fool          0.2%      Yahoo         2.1%      Fund          0.6%      Joe           0.5%
Subject       1.2%      Attack        0.1%      Mime          1.1%      Consumer      0.5%      League        0.5%
Please        1.1%      Federal       0.1%      Hotmail       1.0%      Stock         0.4%      Fantasy       0.5%
Are           1.1%      Bush          0.1%      Version       0.9%      Lay           0.4%      Team          0.5%
Any           0.8%      Sector        0.1%      Mailer        0.8%      Ken           0.4%      Football      0.5%
Forward       0.6%      Assembly      0.1%      Type          0.7%      Donation      0.4%      Game          0.5%
Call          0.5%      Pressure      0.1%      MSN           0.7%      Retire        0.4%      Commission    0.4%
Attach        0.5%      Flight        0.1%      Return        0.7%      Bankruptcy    0.4%      Season        0.4%
Time          0.5%      Terrorist     0.1%      SMTP          0.7%      Declare       0.3%      Player        0.4%
Author Topic            PLSI-U                  Author Topic            PLSI-U                  Author Topic
w/ Dictionary           w/ Dictionary           No Dictionary           No Dictionary           No Dictionary

Figure 4.5: Additional Categories from Author Topic and PLSI-U by most probable
words

In examining the topic Senior Mgmt, recall that it is determined by looking
at the preferred topics of Ken Lay, Jeff Skilling and Andy Fastow and selecting the
one that is most common to all of them. While there is only a little overlap in the
words between the four experiments (Figures ?? - ??), all four clearly give the
reader the sense of a senior management topic. This is more true when one knows
something about the individuals employed at Enron. Sherri Sera was Jeff Skilling's
personal assistant. Mark Palmer was head of Corporate Communications. Cindy
Olson was head of Human Resources and Steven Kean was the Chief of Staff. All of
these individuals were involved in the weekly management meetings. It is interesting
to observe that although the first two experiments were restricted to words in the
dictionary, some names seeped through if their stemmed base was the same as the
stemmed base of a word in the dictionary. Examples of this include Ken (ken) Lay
(lay), Skilling (skill), Sherri (sherry) Sera (sera - plural of serum), and Cindy (cinder).
In light of this, it emerges that (at least for the first category) there is not much
difference in the quality of results between experiments where the words are restricted
to the dictionary and experiments where they are not.

Unlike the Senior Mgmt topic, the California Crisis topic emerges strictly by
examining the most probable words. This is the first topic where one observes a
significant difference in the quality of results between experiments. Interestingly, the
highest quality results come from the two experiments that are the most different, the
Author Topic run restricted to words in the dictionary and the PLSI-U run using all
words. Here is where one sees the strengths and weaknesses of not restricting words
to the dictionary. When the Author Topic experiment is run restricted to words in
the dictionary, a topic clearly emerges. However, when it is run again and includes
non-dictionary words, people’s names dilute the descriptiveness of the topic. On the
other hand, PLSI-U with only dictionary words barely provides enough information to
support a topic description. But when acronyms are allowed, PLSI-U adds significant
descriptiveness by adding the California Planners Information Network (CALPIN),
the Independent Energy Producers Association (IEPA) and Jeff Dasovich, who was
Enron’s representative in charge of government affairs in California.

The  Research  topic  at  first  glance  appears  to  show  a  mingling  of  two  topics, 
one of research within Enron and the second involving universities (possibly recruiting).
However, upon closer examination, it emerges that Vince Kaminski, head of the
Research  Group,  had  a  close  relationship  with  the  faculty  at  Rice  University  (and  is 
currently  an  adjunct  professor  there).  He  and  several  of  his  employees  often  spoke 
there  and/or  invited  classes  to  Enron  for  research  projects.  As  a  result,  the  topic 
is  clearly  about  Enron’s  Research  Group.  Here  again,  names  and  email  addresses 
emerge within the most probable words. In addition to Dr. Kaminski, Shirley Crenshaw
was the administrative coordinator for the Research Group and Stinson Gibner
was  a  Vice  President  in  the  group.  With  this  topic,  all  of  the  experiments  produce 
quality  results  that  allow  a  topic  description  to  easily  emerge. 

The  Information  Technology  topic  is  the  last  that  is  consistent  throughout  all 
of  the  experiments.  While  some  of  the  words  such  as  directory,  hardware,  user,  and 
access  are  clearly  IT-type  words,  others  are  not.  However,  many  of  the  other  words 
are  actually  the  names  of  computer  programs  run  at  Enron.  SAP  was  their  Enterprise 
Resource  Planner.  Unify  was  the  deal  settlement  processing  system  and  Sitara  was 
the  system  used  to  complete  physical  gas  deals.  While  all  of  the  experiments  produce 
good  results,  the  experiments  not  restricted  to  words  in  the  dictionary  appear  to 
perform  better  since  they  are  able  to  include  the  names  of  software  packages.  Unlike 
the  previous  two  topics,  only  one  individual’s  name  appears  (Regan  Smith  was  a 
network  administrator)  resulting  in  a  poorer  quality  of  results. 

Finally,  additional  categories  provide  examples  of  other  topics  that  emerge  from 
these algorithms. Although no pornographic categories emerge, Author Topic extracts
a fantasy football league that appeared fairly active at Enron. There are well over
7,000 internal emails sent between people at Enron on the topic of fantasy football.
While  PLSI-U  never  extracts  a  recognizable  topic  on  fantasy  football,  it  does  extract  a 
recognizable  topic  on  the  September  11  attacks  and  one  on  the  last  days  of  Enron  prior 
to  the  bankruptcy.  An  additional  topic  that  emerges  is  focused  on  AOL,  Yahoo,  and 
Hotmail,  suggesting  that  many  individuals  may  have  accessed  their  personal  emails 
while  at  work.  Although  nothing  emerges  to  suggest  industrial  espionage,  the  use  of 
personal  emails  would  have  provided  an  excellent  mechanism  for  moving  secrets  out 
of  the  company. 

Both PLSI-U and Author Topic clearly succeed at the useability metric for most
probable words: both extract coherent topics from the email corpus. In three of the
four experiments (the exception being PLSI-U run on only words in the dictionary),
it is easy to describe the topics based solely on the twenty-five most probable words
for that topic.
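
As an illustration of how a topic can be described by its most probable words, the following minimal sketch ranks the vocabulary of each topic by p(w | z) and keeps the top n. It assumes the fitted model exposes the word distribution of each topic as a plain list of probabilities; the function name, the toy vocabulary, and all probability values are invented for the example and are not taken from the experiments.

```python
def top_words(p_w_given_z, vocab, n=25):
    """Return the n most probable (word, probability) pairs for each topic.

    p_w_given_z: list of rows, one per topic; row[i] is p(vocab[i] | topic).
    """
    ranked_topics = []
    for row in p_w_given_z:
        # sorted() is stable, so tied words keep their vocabulary order.
        ranked = sorted(zip(vocab, row), key=lambda pair: pair[1], reverse=True)
        ranked_topics.append(ranked[:n])
    return ranked_topics


# Toy model: 2 topics over a 5-word stemmed vocabulary (all values invented).
vocab = ["trade", "swap", "research", "univers", "credit"]
p_w_given_z = [
    [0.40, 0.30, 0.05, 0.05, 0.20],
    [0.05, 0.05, 0.50, 0.35, 0.05],
]
topics = top_words(p_w_given_z, vocab, n=3)
for z, words in enumerate(topics):
    print(z, [w for w, _ in words])
```

With n = 25 this yields the kind of word list an analyst would read in order to name a topic.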

4-1.2 Describing Topics with Users. The topics by most probable individuals
are in Figures ??, ??, ??, and ??. The first challenge is determining individuals’
positions within Enron, since this information is not contained within the corpus.
Luckily, there is a copious amount of information both on the Internet [?,?,?,?] and
in books [?,?] on the rise and fall of Enron. As a result, despite a lack of job
information in the actual corpus, it is still possible to determine what many
individuals’ positions were. One thing that emerges is the number of different email
addresses possessed by certain individuals. Vince Kaminski, Managing Director and
Head of Research, had at least five. Ken Lay, Chairman, had five personal email
addresses and at least as many titular email addresses (e.g. Enron Office of the
Chairman, Ken Lay - Office of the Chairman, etc.). This degrades the quality of
results in the 10 most probable individuals. Consider the PLSI-U without a dictionary
experiment. Vince Kaminski appears in the Research category four times in the top
ten individuals under different email
addresses for a total probability of 30.04%. For identifying or connecting topics with
individuals, PLSI-U appears to produce results at least as good as Author Topic. Of
the four categories, only in one case did Author Topic perform slightly better.

CATEGORY 45: SENIOR MGMT

Individual              Probability  Position
Steven Kean             5.8%         Chief of Staff, Government Relations Specialist
Stanley Horton          4.0%         Chief Executive - Enron Transportation Group
Steven Kean             2.5%         Chief of Staff, Government Relations Specialist
Maureen McVicker        2.4%
Rosalee Fleming         2.1%         Secretary to Enron Chairman Kenneth Lay
Greg Whalley            1.8%         President of Enron
Mark Frevert            1.6%         Vice-Chairman of Enron
Kenneth Lay             1.6%         Chairman of Enron
Cindy Olson             1.5%         Head of Human Resources
Jeff McMahon            1.3%         Chief Financial Officer of Enron

CATEGORY 2: CALIFORNIA CRISIS

Individual              Probability  Position
Ken Lay                 7.6%         Chairman of Enron
Karen Denne             6.7%         Vice President of Public Relations
Sandra McCubbin         4.9%         Director of Government Affairs in California
Paul Kaufman            3.9%         Director of Government Affairs
Jeff Dasovich           3.8%         Government Affairs Executive
Harry Kingerski         3.6%
Steven Kean             3.3%         Chief of Staff, Government Relations Specialist
Mark Palmer             3.2%         Head of Corporate Communications
Susan Mara              3.1%         Director of Government Affairs in California
James Steffes           2.8%         Vice President of Government Affairs

CATEGORY 47: RESEARCH

Individual              Probability  Position
Vince Kaminski          34.1%        Managing Director and Head of Research
Jeffrey Shankman        6.2%         Chief Operating Officer for Global Markets
Shirley Crenshaw        5.0%         Research Group Administrative Coordinator
Stinson Gibner          4.0%         Vice President in Quantitative Research Group
Vasant Shanbhogue       1.8%         Vince Kaminski’s Second in Command
Tanya Tamarchenko       1.5%         Director - Value at Risk
Zimin Lu                1.5%         Director of Valuation and Trading Analytics Group
Jennifer Burns          1.4%
Grant Masson            1.2%         Vice President - Research Group
Pinnamaneni Krishnarao  1.2%         Vice President - Research Group

CATEGORY 40: INFO TECHNOLOGY

Individual              Probability  Position
Lisa Kinsey             1.0%
Robert Superty          1.0%         Enron North America - Director Gas Procurement
Patti Sullivan          1.0%
Daren Farmer            0.8%         Logistics Manager
Victor Lamadrid         0.8%
Darla Saucier           0.8%
Kirk Lenart             0.7%
Tammy Gilmore           0.7%
Cora Pendergrass        0.7%
Mark Schrab             0.6%

Figure 4.6: PLSI-U Individuals Most Associated with Sample Categories using Stemmed, Dictionary Words
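
The effect of multiple addresses per person can be made concrete with a small sketch that sums the topic probabilities over every address belonging to the same individual. It is only an illustration: it assumes a hand-built mapping from address to person is available (the corpus itself does not supply one), the address strings are placeholders rather than real addresses, and the function name is hypothetical. The probabilities are the Research entries reported for the PLSI-U run without a dictionary.

```python
from collections import defaultdict


def merge_aliases(ranked, alias_to_person):
    """Sum topic probabilities over all addresses that map to the same person."""
    totals = defaultdict(float)
    for address, prob in ranked:
        # Addresses without a known owner stand for themselves.
        person = alias_to_person.get(address, address)
        totals[person] += prob
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Probabilities in percent; address names are placeholders (assumption).
ranked = [
    ("kaminski_addr_1", 20.88),
    ("kaminski_addr_2", 5.30),
    ("crenshaw_addr", 3.49),
    ("kaminski_addr_3", 2.86),
    ("kaminski_addr_4", 1.00),
]
alias_to_person = {f"kaminski_addr_{i}": "Vince Kaminski" for i in range(1, 5)}
alias_to_person["crenshaw_addr"] = "Shirley Crenshaw"

merged = merge_aliases(ranked, alias_to_person)
for person, prob in merged:
    print(f"{person}: {prob:.2f}%")
```

Merging in this way collapses the four Kaminski entries into one, which frees places in the top ten for other individuals.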

It is reasonable that the Senior Mgmt topic produces good results since it is
created by looking at specific users. The results from PLSI-U bear this out, producing
Ken Lay (Chairman), Greg Whalley (CEO), and Jeff McMahon (COO). Also very
prominent are Enron’s General Counsel, the head of Human Resources, and the head
of Investor Relations. While Author Topic also produces good results, their quality
is lessened by the presence of generic email addresses (ethink, All Enron Gas
Services) and some lower-level individuals. Overall, PLSI-U significantly outperforms
Author Topic both when words are restricted to a dictionary and when the words are
unrestricted.

PLSI-U also produces superior results for the California Crisis topic. It extracts
many individuals recognizably involved in the management of public affairs in
California. While Author Topic produces some of these individuals as well, there are
many individuals that are unidentified. Judging from their emails, they do appear to
have been involved in conducting business in California. However, their exact
positions are unknown. As a result, by examination, it appears that PLSI-U also
outperforms Author Topic in this category.

CATEGORY 43: CALIFORNIA CRISIS

Individual              Probability  Position
Kevin Fulton            0.1%
Eric Letke              0.1%         Enron Energy Services
snovose                 0.1%
Robert Frank            0.1%         State Government Affairs
Hap Boyd                0.1%         Enron Wind Corporation
.sue                    0.1%
Tamara Johnson          0.1%
Tamara Johnson          0.1%
Mark Palmer             0.1%         Head of Corporate Communications
Becky Merola            0.1%

CATEGORY 9: SENIOR MGMT

Individual              Probability  Position
Michael Homing          0.05%
Jeff McMahon            0.05%        Chief Operating Officer
Anthony Duenner         0.05%        Senior Vice Pres Global Assets & Services
ethink                  0.05%
Mitch Meyer             0.05%
All Enron Worldwide     0.05%
Matthew Scrimshaw       0.04%
Nate Ellis              0.04%        Director Enron Energy Services
Mariano Gomez           0.04%
Margaret Carson         0.04%        Director of Corporate Strategy

CATEGORY 30: RESEARCH

Individual              Probability  Position
Grant Masson            0.04%        Vice President - Research Group
Kenneth Deng            0.04%        Manager of Quantitative Research
Mary Bailey             0.04%
Vince Kaminski          0.04%        Managing Director and Head of Research
Network Security        0.04%
Althea Gordon           0.03%        Recruiter - Associates/Analysts Program
Jason Sokolov           0.03%        Risk Management Group employee
Lenos Trigeorgis        0.03%        Risk Management Group employee
Rehman Sharif           0.03%
Nedre Strambler         0.03%

CATEGORY 28: INFO TECHNOLOGY

Individual                                    Probability  Position
houston.report                                0.1%
Eric Saibi                                    0.1%         Enron Capital & Trading - East Desk
SAP Security                                  0.1%
EES Power Settlements                         0.1%
subscribers@mailman                           0.1%
weatherward@mailman                           0.1%
Integrated Solutions Center - I/T Help Desk   0.1%
Jeffrey Jackson                               0.1%
ISC Systems Notification                      0.1%
Enron Users                                   0.1%

Figure 4.7: Author Topic Individuals Most Associated with Sample Categories using Stemmed, Dictionary Words

The  Research  topic  differs  from  the  previous  two  by  its  limited  nature.  This 
topic  is  focused  on  a  relatively  small  group  within  the  Enron  corporation.  As  a  result, 
all  of  the  experiments  show  excellent  results.  This  is  despite  a  mix  of  small  and  large 
email  datasets  for  the  top  individuals.  This  suggests  that  when  attempting  to  find 
individuals  who  all  participate  in  a  category,  if  the  category  is  of  limited  interest,  then 
the  results  are  excellent. 

The last category is Information Technology. This category most clearly
indicates the problem of attempting to identify individuals without extensive inside
knowledge. Only one I/T professional (Regan Smith) is identified. However, Author
Topic also extracts email ids associated with software packages such as SAP, ibuyit
(an Accounts Payable package), and the Enron help desk. For this category, Author
Topic outperforms PLSI-U.

CATEGORY 11: SENIOR MGMT

Individual              Probability  Position
James Derrick           3.60%        General Counsel
Cindy Olson             1.98%        Head of Human Resources
Kay Chapman             1.67%        Secretary of Management Committee
Mark Koenig             1.67%        Executive Vice President of Investor Relations
Greg Whalley            1.66%        President of Enron
Steven Kean             1.57%        Chief of Staff, Government Relations Specialist
Mark Frevert            1.54%        Vice-Chairman of Enron
Jeffrey McMahon         1.46%        Chief Financial Officer of Enron
Kenneth Lay             1.25%        Chairman of Enron
David Delainey          1.19%        Enron Energy Services CEO

CATEGORY 0: CALIFORNIA CRISIS

Individual              Probability  Position
Jeff Dasovich           1.46%        Enron Government Affairs Executive
James Wright            1.04%
Richard Sanders         0.88%        VP and Asst General Counsel for Enron Wholesale
Susan Mara              0.84%        Director of Government Affairs in California
Scott Stoness           0.83%
Dennis Benevides        0.80%        Director of Green Power for Enron Energy in CA
Sandra McCubbin         0.80%        Director of Government Affairs in California
Richard Shapiro         0.80%        VP of Regulatory Affairs & principal DC lobbyist
James Steffes           0.76%        Vice President of Government Affairs
Harry Kingerski         0.76%

CATEGORY 44: RESEARCH

Individual              Probability  Position
Vince Kaminski          20.88%       Managing Director and Head of Research
Vince Kaminski          5.30%        Managing Director and Head of Research
Shirley Crenshaw        3.49%        Research Group Administrative Coordinator
Vince Kaminski          2.86%        Managing Director and Head of Research
Stinson Gibner          2.40%        Vice President in Quantitative Research Group
Don Baughman            2.00%        North America Power trader - East Desk
Vasant Shanbhogue       1.52%        Kaminski’s second in command
Zimin Lu                1.05%        Director of Valuation and Trading Analytics Group
Eric Bass               1.03%        trader
Vince Kaminski          1.00%        Managing Director and Head of Research

CATEGORY 7: INFO TECHNOLOGY

Individual              Probability  Position
Daren Farmer            2.37%        Logistics Manager
Robert Superty          1.82%        Director - Gas Procurement Enron North America
Patti Sullivan          1.54%
Victor Lamadrid         1.46%
Lisa Kinsey             1.38%
Bryce Baxter            1.20%
Tammy Jaquet            1.04%
Clarissa Garcia         0.97%
Regan Smith             0.89%        Network Administrator
Kevin Heal              0.87%

Figure 4.8: PLSI-U Individuals Most Associated with Sample Categories using Stemmed Words without a Dictionary

Overall, both PLSI-U and Author Topic performed excellently, consistently
producing high numbers of relevant individuals. Whether the words are restricted to the
dictionary or not, the quality of results is unchanged. Interestingly, despite the greater
simplicity of the PLSI-U generative model, it performs better in extracting relevant
individuals.

Both  PLSI-U  and  Author  Topic  succeed  at  this  useability  metric.  If  the  goal  is 
to  find  additional  leads  given  a  topic  of  interest,  both  provide  very  appropriate  names 
associated  with  the  topics,  although  PLSI-U  appears  slightly  better.  Both  models 
produce  better  results  when  words  are  not  restricted  to  the  dictionary.  The  only 
exception  to  this  is  Author  Topic’s  tendency  to  include  names.  However,  this  is  easily 
remedied  by  excluding  names  from  the  analysis.  In  general  this  is  also  desirable  from 
a privacy perspective.

CATEGORY 8: SENIOR MGMT

Individual                      Probability  Position
Enron Office of the Chairman    0.09%
Jeff McMahon                    0.09%        Chief Operating Officer
Ken Lay                         0.09%        Enron Chairman
All Enron Gas Services          0.07%
Enron Operations                0.07%
Nate Ellis                      0.07%        Director Enron Energy Services
Barbara Taylor                  0.06%
All Enron Worldwide             0.06%
Office of the Chief Executive   0.06%
Office of the Chairman          0.06%

CATEGORY 18: CALIFORNIA CRISIS

Individual              Probability  Position
Deborah Whitehead       0.14%
Scott Stoness           0.13%
Leasa Lopez             0.13%        Lawyer for Enron Energy Services
Dave Black              0.13%
Ken Gustafson           0.13%        Enron Wind Corporation
JLewis                  0.13%
Tamara Johnson          0.13%
James Steffes           0.12%        Vice President of Government Affairs
Terry Donovan           0.12%
Tamara Johnson          0.12%

CATEGORY 25: RESEARCH

Individual              Probability  Position
West Desk Support       0.11%
Julius Zajda            0.11%
Brad Romine             0.09%        Research Group employee
Tom Barkley             0.09%        Manager in Research Group
Adam Brulinski          0.08%
Eloise Meza             0.08%        Research Group employee
Jason Sokolov           0.08%        Risk Management Group employee
Kenneth Parkhill        0.07%        Research Group employee
Steve Bigalow           0.07%        Research Group technical analyst
Bessik Matchavariani    0.07%        Manager Enron Broadband Services

CATEGORY 10: INFO TECHNOLOGY

Individual              Probability  Position
ipayit                  0.14%        Accounts Payable software
SAP Security            0.14%
enron.payroll           0.14%
payroll.enron           0.13%
ibuyit.payables         0.13%        E-Procurement software
payables.ibuyit         0.13%        E-Procurement software
enron.expertfinder      0.12%        S/W to find subject matter experts within Enron
mbx iscinfra            0.12%
Tahnee Stall            0.11%
ic                      0.11%

Figure 4.9: Author Topic Individuals Most Associated with Sample Categories using Stemmed Words without a Dictionary
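
The remedy suggested above for name-heavy topics, excluding names from the analysis, could look roughly like the following sketch. It assumes a list of stemmed name tokens has been compiled separately (for example from the set of senders and recipients); the function name, the stems, and the probabilities are all invented for illustration.

```python
def drop_names(ranked_words, name_stems):
    """Remove (word, probability) pairs whose word is a known name stem."""
    blocked = {stem.lower() for stem in name_stems}
    return [(w, p) for w, p in ranked_words if w.lower() not in blocked]


# Invented ranking for a research-like topic; values are not from the thesis.
ranked_words = [("kaminski", 0.030), ("research", 0.020),
                ("vinc", 0.018), ("model", 0.010)]
name_stems = ["Kaminski", "vinc"]  # hypothetical name list

cleaned = drop_names(ranked_words, name_stems)
print(cleaned)
```

One design caveat: stems such as "lay" or "skill" are both names and ordinary dictionary words, so a filter of this kind would also remove their non-name uses.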


4-1.3 Describing Topics with Documents. After checking whether topics
make sense in terms of their most probable words and most probable users, a final
check is to consider the most probable documents. Unfortunately, the results applied
to documents are more mixed. Even when considering the most probable documents,
some are clearly related to the topic while some are not. This may be due to the
relatively small number of topics. As a result, (1) the topics are very general,
containing a mix of different sub-topics and (2) documents are composed of multiple
topics. Due to space constraints for this thesis, examples of the most probable
documents are not included.


4-1.4 Individuals with Clandestine Interests. The last metric for the useability
of results is the percentage of individuals with clandestine interests. Unfortunately,
a problem with this metric quickly emerges when the results are produced. If a person
has six emails and the only time a particular topic occurs is on one external email, the
individual is considered to have a clandestine interest. However, this is most likely not
the case and is merely the result of the small dataset for that individual. The average
number of emails sent and received per Enron employee is 71, and examining the
distribution of the number of emails per employee shows that it follows an exponential
distribution with β = 70. Therefore, by including everyone who received at least 12
emails, 85% of the population is included but the small dataset problem is largely
avoided.
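
The coverage figure can be checked with a short calculation. The sketch assumes the email counts are modeled as a continuous exponential distribution whose mean is β = 70, so that the share of employees with at least k emails is exp(-k / β); the function name is hypothetical.

```python
import math


def fraction_retained(min_emails, beta):
    """P(X >= min_emails) for an exponential distribution with mean beta."""
    return math.exp(-min_emails / beta)


retained = fraction_retained(12, 70.0)
print(f"{retained:.3f}")
```

Under these assumptions the result is about 0.84, in line with the roughly 85% of the population quoted above.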

PLSI-U produces a total across all 48 categories of 652 individuals with clandestine
interests for the dictionary words dataset and 304 individuals with clandestine
interests for the all words dataset (no dictionary). This means that, on average, each
category has fewer than 14 (dictionary) or slightly more than 6 (no dictionary)
individuals with a clandestine interest. On the other hand, Author Topic produces a
total across 47 categories of 3,988 individuals for the dictionary words dataset and
1,593 individuals for the all words (no dictionary) dataset. This means that, on
average, each category has slightly more than 84 (dictionary) or roughly 34 (no
dictionary) individuals with clandestine interests. What makes this curious is that
Author Topic seems upon inspection to define the topics more sharply, yet PLSI-U
reveals fewer individuals with clandestine interests despite similar sized interest
networks. However, this may simply point to PLSI-U’s inability to find clandestine
interests because the categories are not finely enough defined. Author Topic finds
that, on average, between 0.1% and 0.2% of Enron employees have a clandestine
interest in a specific topic while PLSI-U only finds between 0.02% and 0.03% of Enron
employees.
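
The per-category averages quoted above follow directly from the reported totals. The short sketch below only redoes that division; the dictionary layout and variable names are illustrative.

```python
# (total individuals with clandestine interests, number of categories)
totals = {
    ("PLSI-U", "dictionary"): (652, 48),
    ("PLSI-U", "no dictionary"): (304, 48),
    ("Author Topic", "dictionary"): (3988, 47),
    ("Author Topic", "no dictionary"): (1593, 47),
}

averages = {key: count / categories
            for key, (count, categories) in totals.items()}

for (model, dataset), avg in averages.items():
    print(f"{model} ({dataset}): {avg:.1f} individuals per category")
```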

Both  models  produced  a  manageable  number  of  potential  insider  threats.  What 
cannot  be  determined  from  this  data  is  if  PLSI-U  produces  too  few,  excluding  valid 
potential  insider  threats,  or  Author  Topic  too  many,  creating  more  work  for  the 
managers. 

4-2  Timeliness  Metric 

In addition to the useability metric, there are two other categories of metrics to
consider. The first concerns the timeliness of the results. The algorithms are run on
the email of approximately 34,000 employees, a total of approximately 250,000 emails.
Using a dictionary of stemmed words, Author Topic produces results after 8 days
(6 days of the algorithm running and 2 days to analyze the data) while PLSI-U
takes 9.5 days (7.5 days of the algorithm running and 2 days to analyze the data).
In both cases, the results are returned in a period of time that makes monthly or
quarterly processing of a company’s email traffic feasible.

4-3  Validity  Metrics 

Finally,  a  technique  that  is  useable  and  timely  is  still  useless  if  the  results  it 
produces  are  not  valid.  Therefore,  the  final  metric  examined  checks  if  Author  Topic 
and  PLSI-U  produce  valid  results.  First,  they  are  checked  to  see  if  Sherron  Watkins 
emerges  as  an  insider  and  second  they  are  checked  to  see  if  common  interests  emerge 
between  people  as  expected. 

4-3.1 Sherron Watkins as a Potential Insider. To see if Sherron Watkins
emerges as a potential insider, she is checked to see if she has a clandestine interest
in the off-book partnerships and if she feels alienated. To see if she feels alienated,
she is checked for a clandestine interest in a socializing topic.

4-3.1.1 Off-Book-Partnerships. First consider the Author Topic
dataset  restricted  to  dictionary  words  (Figure  ??).  The  first  step  an  investigator 
must  take  is  to  determine  what  topic  he  is  interested  in.  For  this  investigation,  the 
topic  concerns  the  off-book  partnerships.  While  the  initial  one  was  called  Rhythms, 
the  later  ones  were  called  LJM  1,  2,  and  3.  The  most  problematic  transactions 
performed  by  them  were  Raptor  I,  Raptor  II,  Raptor  III,  and  Raptor  IV.  In  order  to 
allow  comparisons  between  all  four  datasets,  only  the  word  “raptor”  is  used  to  find 
topics  (since  LJM  would  only  appear  in  the  non-dictionary  datasets). 

Excluding a general “email” topic, the three topics for which the word “raptor” has a
non-zero conditional probability are topics 12 (p(w = raptor | z = 12) = 0.0011), 25
(p(w = raptor | z = 25) = 0.0004), and 30 (p(w = raptor | z = 30) = 0.0002). Observe
that topic 30 is the Research category discussed above. This is very appropriate
considering that the Research division was the first to examine and then reject the
feasibility of the Raptors. Although “raptor” does not appear among its most probable
words, topic 12, with most probable words like trade, agreement, credit, swap, and
financial, does appear to be a topic related to the Raptors.
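
The first investigative step, finding the topics in which a query word has non-zero probability, can be sketched as follows. The sketch assumes the model's p(w | z) values are available as a dictionary per topic; the function name is hypothetical, the "raptor" and "trade" probabilities for topics 12, 25, and 30 are the ones reported here, and topic 41 with its value is an invented filler entry.

```python
def topics_for_word(p_w_given_z, word, exclude=()):
    """Return (topic, p(word | topic)) pairs with non-zero probability,
    highest probability first, skipping any topics listed in exclude."""
    hits = []
    for topic, probs in p_w_given_z.items():
        p = probs.get(word, 0.0)
        if p > 0.0 and topic not in exclude:
            hits.append((topic, p))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


p_w_given_z = {
    12: {"raptor": 0.0011, "trade": 0.020},
    25: {"raptor": 0.0004},
    30: {"raptor": 0.0002},
    41: {"trade": 0.001},  # invented topic in which "raptor" never occurs
}

candidates = topics_for_word(p_w_given_z, "raptor")
print(candidates)
```

The exclude argument is where a catch-all topic such as the general “email” topic would be listed.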

STEP 1: DETERMINING THE TOPIC TO INVESTIGATE

Non-zero Raptor probabilities             p(z|w)
Topic 12: Financial Trade Agreements      0.11%
Topic 25: Financial Risk Management       0.04%
Topic 30: Research                        0.02%

CATEGORY 12: Financial Trade Agreements

Word          Probability
Trade         2.0%
Copy          1.6%
Agreement     1.3%
receive       1.2%
Executive     1.2%
click         1.2%
Credit        1.1%
Swap          1.1%
Financial     1.0%

STEP 2: FIND INDIVIDUALS WITH CLANDESTINE INTERESTS IN THE TOPIC

Clandestine Interests - Topic 12 Financial Trade Agreements

Stacey Ramsey           Angela Liknes           K. Longoria             John Distumal
Corbin Barnes           Ilan Caplan             Kimberly Hardy          Dave Kistler
Peter Berger            Andrea Reed             Sherron Watkins         Edosa Obayagbona
Trevor Randolph         Frank Lobdell           Mac Mclelland           Junellen Pearson
Kelly Lovvorn           Joshua Koenig           Mika Watanabe           John Bottomley
Mark Haedicke           Tori Hayden             Michelle Schultz        Esther Gerratt
Jayanta Sengupta        Nikole Vander           Michael Nanny           Bryan Garrett
Adam Pollock            Cecil John              Carmella Jones          Victoria McDaniel
Habiba Bayi             Felicia Solis           Anita Grandos           Kimberly Nelson
James Puntumapanitch    Adriana Wynn            Jim Roth                Michael Rump
Melissa Allen           Olivier Herbelot        Nelly Carpenter         Michele Beffer
Katherine Chisley       Laura Johnson           Clay Spears             Patrick Conner
Jeffrey Austin          John Boomer             Tom Halpin              Mary Hubbard
Darla Steffes           Omar Aboudaher          Lena Kasbekar           Peter Traung
James Foster            Gardiner Corby          Robert Pickel           Duncan Croasdale
Peter Maheu             Warren Schick           Joe Hoang               Barbara Hankins
Christi Nicolay         Jay Johnson             Brenda Funk             Fabian Valle
Llewelyn Hughes         Linda Noske             Jesse Alvorado

Figure 4.10: Investigating if Sherron Watkins is an Insider for Author Topic restricted to Dictionary Words


The next step is checking which individuals have clandestine interests in these
topics. Although this investigation needs to be performed for all three topics, only
topic 12 is shown here. Observe that Sherron Watkins is one of the 71 individuals
who emerge as having a clandestine interest in this topic. Recall that this means
that although she has an interest in financial trade agreements, she never sent an
email to, or received an email from, anyone at Enron about them. Observe that in this
case the investigation is being performed after the fact. Therefore, no additional
investigation is necessary and Sherron Watkins emerges as a potential insider. If this
were being performed to generate potential insider threat leads, the next step would
be to talk to the managers of each of the 71 individuals and dig deeper to determine
whether these insider threat leads merit further attention.
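
The notion of a clandestine interest used in this step can be illustrated with a minimal sketch: a topic that appears in a person's external email but never in email exchanged with colleagues. The input format is a deliberate simplification and an assumption of this example (each email already reduced to a single topic label and an internal/external flag); the function name, the employee identifiers, and the sample data are invented.

```python
def clandestine_interests(emails, person):
    """Topics a person discusses externally but never internally.

    emails: iterable of (person, topic, is_internal) tuples (assumed format).
    """
    internal, external = set(), set()
    for who, topic, is_internal in emails:
        if who == person:
            (internal if is_internal else external).add(topic)
    return external - internal


# Invented sample data with placeholder employee identifiers.
emails = [
    ("employee_a", "financial trade agreements", False),
    ("employee_a", "senior mgmt", True),
    ("employee_a", "senior mgmt", False),
    ("employee_b", "california crisis", True),
]

print(clandestine_interests(emails, "employee_a"))
```

A real implementation would work from the topic probabilities of each email rather than a single label, and would apply the minimum-email threshold discussed earlier before flagging anyone.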

Unfortunately, when the same steps are performed on the other datasets, the
results are not as promising. When the restriction is removed for Author Topic,
only two topics emerge as probably related to the word “raptor”: topics 2
(p(w = raptor | z = 2) = 0.0011) and 25 (p(w = raptor | z = 25) = 0.0004), where
topic 25 is again the Research topic and topic 2 appears to be a legal topic. Despite
Sherron Watkins being interested in 9 different topics, none of them is either topic 2
or 25. As a result, she does not emerge as a potential insider for this experiment.

After performing the investigation on the Author Topic datasets, the next step
is to perform it on the PLSI-U datasets. Raptor does not appear to have a non-zero
probability for any topic when PLSI-U is run restricted to dictionary words.
However, it does appear when PLSI-U is run without the restriction. The five topics
with the highest conditional probabilities are topics 3 (p(w = raptor | z = 3) = 0.0044),
33 (p(w = raptor | z = 33) = 0.0008), 41 (p(w = raptor | z = 41) = 0.0005), 39
(p(w = raptor | z = 39) = 0.0004), and 44 (p(w = raptor | z = 44) = 0.0004). Observe
that category 44 is the Research category discussed above. This is very appropriate
considering that the Research division was the first to examine and then reject the
feasibility of the Raptors. Topic 3 appears to be about credits and market swaps
and topic 39 about trading electricity. Unfortunately, it is difficult to identify any
topic descriptions for topics 41 and 33. The next step is checking which individuals
have clandestine interests in these topics. While several individuals emerge as having
clandestine interests in each of these topics, Sherron Watkins is never one of them.

4-3.1.2 Alienation. Similar steps are then taken to see if Ms.
Watkins appeared to feel alienated at work. For this analysis, there is no clear word
that defines socializing, and so several are used. Appropriate words include dinner,
drink, fun, tonight, love, weekend, family and game. In each case, only one or two
topics emerge as having a non-zero probability for each word.

The  results  appear  to  parallel  those  for  the  off-book  partnerships.  For  the 
Author  Topic  dataset  restricted  to  dictionary  words,  Ms.  Watkins  emerges  as  having 
a  clandestine  interest  in  one  of  the  two  socializing  topics  and  no  interest  in  the  other. 
Therefore,  for  this  experiment,  she  emerges  (along  with  226  other  individuals)  as 
possibly  feeling  alienated  and  a  potential  insider  threat.  When  this  result  is  combined 
with  individuals  who  had  clandestine  interests  in  the  off-book  partnerships,  only  two 
other names emerge, Dave Kistler and Llewelyn Hughes. Therefore, if this had been a
real  world  case  and  the  CFO  had  combined  results  between  these  two  topics,  he  could 
have  quickly  zeroed  in  on  Watkins  as  a  possible  insider  threat. 
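
Combining the two indicators amounts to intersecting two sets of names. The sketch below is illustrative only: the three names in the overlap are the ones reported above, Stacey Ramsey is taken from the topic 12 list, and "Hypothetical Person" is a placeholder standing in for the many individuals who appear in only one of the two lists.

```python
# Individuals with a clandestine interest in the off-book partnership topic
# (a small excerpt of the 71 names, for illustration).
off_book = {"Sherron Watkins", "Dave Kistler", "Llewelyn Hughes", "Stacey Ramsey"}

# Individuals flagged as possibly alienated (excerpt plus one placeholder).
alienated = {"Sherron Watkins", "Dave Kistler", "Llewelyn Hughes",
             "Hypothetical Person"}

# Leads are the people who appear in both lists.
leads = sorted(off_book & alienated)
print(leads)
```

Requiring both indicators is what shrinks the candidate list from dozens or hundreds of names to a handful.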

As  with  the  off-book-partnerships,  the  results  from  the  other  datasets  are  not 
as promising. Ms. Watkins does emerge as having an interest in a socializing topic
when  Author  Topic  is  run  without  the  dictionary  restriction.  However,  her  interest  in 
this  case  is  not  clandestine  (she  received  one  email  related  to  this  topic).  In  addition, 
when  the  analysis  is  performed  for  PLSI-U,  she  does  not  emerge  as  having  an  interest 
in  the  socializing  topics  at  all. 

4-3.1.3 Summary. In summary, Sherron Watkins emerges as having
a clandestine interest in the “raptor” topic and as feeling alienated only for Author
Topic (dictionary). While it is encouraging that she emerges as a potential insider for Author
Topic  (dictionary),  it  is  disconcerting  that  she  does  not  also  emerge  for  Author  Topic 
(no  dictionary).  One  possible  explanation  for  this  is  the  huge  number  of  names  that 
emerge  as  the  most  common  words  for  topics.  It  would  be  informative  to  re-run  the 
Author  Topic  (no  dictionary)  test  with  all  proper  names  stripped. 

4-3.2  Common  Interests.  The  second  metric  of  validity  is  checking  to  see 
if  Ken  Lay,  Jeff  Skilling,  and  Andy  Fastow  share  similar  interests.  However,  we 
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have  already  seen  that  they  do  indeed  share  one,  the  Senior  Mgmt  topic,  since  it 
was  picked  by  finding  a  topic  of  interest  they  all  had  in  common.  This  was  true 
for  all  four  experiments.  The  second  test  of  common  interests  is  to  see  if  Andy 
Fastow  and  Michael  Kopper  emerge  as  having  a  shared  interest  in  the  “raptor”  topic. 
Unfortunately  PLSI  generates  no  interests  for  Fastow  and  Kopper  when  run  restricted 
to  dictionary  words.  However,  when  run  without  this  restriction,  it  extracts  seven 
topics  of  interest  for  Fastow  and  two  for  Kopper,  including  one  in  common,  topic  24. 
While  topic  24  does  not  emerge  as  being  related  to  the  Raptors  (see  Section  ??),  it 
does  demonstrate  a  common  interest  between  the  two.  When  Author  Topic  is  checked 
for  common  interests  between  Fastow  and  Kopper,  the  results  are  similar.  When  run 
restricted  to  dictionary  words,  the  only  topic  they  appear  to  have  in  common  is  one 
related  to  trading  energy.  Finally,  when  run  without  the  restriction,  they  share  an 
interest  in  a  topic  related  to  home  life. 
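
The common-interest check above amounts to intersecting two individuals' sets of topics. The following is a minimal sketch in Python, assuming each person's interests are already available as a set of topic numbers. Only the counts (seven topics and two topics) and the shared topic 24 come from the text; the remaining topic numbers and the function name are invented for illustration.

```python
def shared_interests(interests, a, b):
    """Return the sorted topic numbers that both a and b are interested in."""
    return sorted(interests.get(a, set()) & interests.get(b, set()))


# Hypothetical interest profiles. Apart from topic 24, the topic numbers
# are placeholders and do not come from the experiments.
interests = {
    "fastow": {3, 11, 17, 24, 29, 35, 41},
    "kopper": {24, 38},
}

common = shared_interests(interests, "fastow", "kopper")
print(common)
```

An empty result for a pair, as happens for the dictionary-restricted PLSI run, simply means that no topic passed the interest cutoff for both individuals.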

The results for this metric are mixed: they are promising for all experiments
when considering Lay, Skilling, and Fastow, but negative when considering
Kopper and Fastow. However, this may be due to the scarcity of emails for Kopper
(only 10), which leaves both techniques with difficulty assigning him to the correct
clusters.

4.4 Additional Experimental Results

Now  that  the  probabilistic  clustering  and  social  networking  techniques  have  been 
shown  to  be  useful,  there  are  two  final  pieces  of  analysis  to  perform.  The  first  concerns 
the  number  of  topics.  As  discussed  in  Chapter  ??,  the  number  of  topics  was  selected 
due  to  hardware  and  software  constraints.  A  logical  question  is  whether  a  second 
iteration  of  probabilistic  clustering  can  be  performed  on  a  single  topic  and  extract 
48  additional  sub-topics.  The  second  analysis  concerns  the  social  networks.  While 
they  have  already  demonstrated  their  worth  by  revealing  the  clandestine  interests  of 
various Enron employees, there is still more information that can be extracted from
them. By performing several traditional social network analysis techniques, it may
be  possible  to  glean  additional  useful  information  from  the  networks. 

4.4.1 Analysis of the Two-Tiered Approach. After performing the above
analysis,  it  is  reasonable  to  wonder  if  the  California  Crisis  category  can  be  expanded 
into  multiple  sub-topics  that  might  allow  for  a  better  analysis  of  Enron’s  alleged 
duplicity  in  California.  To  test  this,  the  Author  Topic  and  PLSI-U  topics  from  the 
stemmed  word  (no  dictionary)  dataset  are  considered.  The  documents  and  individuals 
that  have  a  significant  probability  of  this  topic  are  extracted  and  PLSI-U  and  Author 
Topic  are  performed  on  them  a  second  time. 
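
The extraction step of the two-tiered approach can be sketched as a simple filter. This is an illustration only: the text does not state what counts as a "significant probability", so the 0.5 threshold, the function name, and the sample data below are all assumptions.

```python
def select_for_reclustering(doc_topic, topic, threshold=0.5):
    """Return ids of documents whose probability for `topic` meets the threshold.

    doc_topic maps a document id to a {topic number: probability} dict.
    """
    return sorted(
        doc_id
        for doc_id, probs in doc_topic.items()
        if probs.get(topic, 0.0) >= threshold
    )


# Hypothetical P(topic | document) values; topic 0 stands in for California Crisis.
doc_topic = {
    "email_001": {0: 0.82, 7: 0.10},
    "email_002": {0: 0.05, 44: 0.90},
    "email_003": {0: 0.61, 11: 0.30},
}

subset = select_for_reclustering(doc_topic, topic=0, threshold=0.5)
print(subset)
```

The selected subset (and the corresponding individuals) would then be passed through the same clustering procedure a second time.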

The results are not encouraging. When a second level of probabilistic clustering
is  performed  on  the  PLSI-U  results,  no  additional  topics  emerge.  There  is  only  one 
recognizable  topic  that  emerges  and  it  is  a  combination  of  the  California  Crisis  and  the 
Scheduling  topics  (Figures  ??  and  ??).  The  most  probable  words  for  the  remaining 
topics  have  such  a  low  probability  that  it  suggests  that  no  clustering  occurred.  While 
reclustering  the  Author  Topic  dataset  does  produce  48  different  identifiable  topics, 
only  two  topics  emerge  that  are  related  to  the  California  Crisis.  The  remaining  topics 
are  consistent  with  the  topics  that  emerged  initially  when  Author  Topic  was  run. 

On  a  positive  note,  this  admittedly  limited  analysis  suggests  that  the  number 
of  topics  used  for  the  original  analysis  is  appropriate.  If  the  number  of  topics  had 
been  too  small,  additional  topics  unrelated  to  the  original  meta-topics  would  have 
emerged.  That  they  did  not  suggests  that  they  are  visibly  accounted  for  in  the 
original  clustering. 

4.4.2 Social Network Analysis. The creation of social networks provides
the opportunity to test the effectiveness of traditional social network analysis (SNA)
on extracting potential insider threats. First, several centrality measurements are
reviewed  to  see  if  SNA  extracts  appropriate  individuals  as  being  representative  of 
a particular topic. If so, SNA can also be used to identify individuals central to
(possibly undesirable) topics. Second, the effectiveness of SNA to determine the
positions  of  individuals  based  on  their  SNA  measurements  is  considered.  Third,  an 
explicit  network  without  topics  is  reviewed  to  see  if  comparing  it  with  the  explicit 
topic  networks  provides  additional  insight.  Finally,  a  temporal  analysis  is  performed 
to  see  if  it  is  possible  to  view  the  movement  of  a  topic  across  the  organization  through 
an  analysis  of  email. 

4.4.2.1 Measurements. There are additional methods for considering
the explicit social networks generated by the categorized emails. As discussed
in  Section  ??,  there  are  several  measurements  of  centrality  that  can  be  performed 
on  these  networks.  It  is  of  interest  to  compare  the  individuals  that  emerge  as  the 
most  important  by  these  measurements  to  those  that  emerge  as  being  most  probable 
from  the  implicit  social  networks  generated  by  the  interest  profiles.  First,  consider 
Figure  ??.  It  is  readily  apparent  that  while  the  most  probable  individuals  and  the 
individuals  with  the  largest  centrality  measurements  are  fairly  different,  there  is  very 
little  difference  between  centrality  rankings.  Between  the  top  three  measurements, 
only  five  individuals  do  not  appear  in  at  least  two  of  the  rankings.  This  phenomenon 
repeats  for  all  four  topics  across  all  four  experiments.  Therefore,  for  brevity,  only  the 
top  ten  most  central  individuals  based  on  betweenness  for  each  of  the  four  sample 
topics  are  shown  for  each  of  the  four  experiments  (Figures  ??,  ??,  ??,  and  ??). 
Each  of  these  shows  a  quality  of  results  similar  to  what  is  seen  with  the  most  probable 
individuals  suggesting  that  results  returned  this  way  may  also  be  used  for  extracting 
individuals  associated  with  a  (possibly  undesirable)  topic.  As  with  the  most  probable 
individuals,  PLSI-U  appears  better  at  extracting  appropriate  individuals  as  the  most 
central. While there is no efficient way to determine the smallest number of individuals
that will most efficiently fracture the network, those individuals deemed the most
important  can  certainly  be  considered  critical  to  efficient  communication  within  the 
social  network. 
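
For reference, the three centrality measurements compared above can be sketched in Python for an undirected, unweighted graph stored as an adjacency dictionary. This is not the thesis code (which was written in C and C++). It uses the standard normalized definitions and Brandes' algorithm for betweenness, which is an assumption about how the reported values were computed. The sample network and its vertex names are hypothetical.

```python
from collections import deque


def degree_centrality(adj):
    """Fraction of the other vertices each vertex is directly linked to."""
    n = len(adj)
    return {v: len(nbrs) / max(n - 1, 1) for v, nbrs in adj.items()}


def _bfs(adj, s):
    """Breadth-first search from s.

    Returns distances, shortest-path counts, predecessor lists and visit order.
    """
    dist, sigma, preds, order = {s: 0}, {s: 1.0}, {s: []}, []
    queue = deque([s])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in adj[v]:
            if w not in dist:  # first time w is reached
                dist[w], sigma[w], preds[w] = dist[v] + 1, 0.0, []
                queue.append(w)
            if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                sigma[w] += sigma[v]
                preds[w].append(v)
    return dist, sigma, preds, order


def closeness_centrality(adj):
    """(n - 1) over the summed distance to all others; assumes a connected graph."""
    n = len(adj)
    result = {}
    for v in adj:
        total = sum(_bfs(adj, v)[0].values())
        result[v] = (n - 1) / total if total else 0.0
    return result


def betweenness_centrality(adj):
    """Brandes' algorithm, normalized by the (n - 1)(n - 2) / 2 vertex pairs."""
    n = len(adj)
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        _, sigma, preds, order = _bfs(adj, s)
        delta = dict.fromkeys(order, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    pairs = max((n - 1) * (n - 2) / 2, 1)
    # Each undirected pair is counted from both endpoints, hence the halving.
    return {v: (b / 2) / pairs for v, b in bc.items()}


# Hypothetical topic network in which an assistant relays mail for everyone.
adj = {
    "assistant": {"exec_a", "exec_b", "analyst", "counsel"},
    "exec_a": {"assistant", "exec_b"},
    "exec_b": {"assistant", "exec_a"},
    "analyst": {"assistant"},
    "counsel": {"assistant"},
}
for name, score in sorted(betweenness_centrality(adj).items(), key=lambda kv: -kv[1]):
    print(f"{name:10s} {score:.2f}")
```

In this toy network the assistant ranks first on all three measurements, which mirrors the observation that the centrality rankings differ very little from one another.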


The same performance comparison made on individual actors between PLSI-U
and Author Topic can also be made on the networks as a whole. By looking at
the cohesion measurements (Table ??), one can see that PLSI-U produces clusters
with a higher degree cohesiveness (more vertices with the same degrees), closeness
cohesiveness and in some cases higher betweenness measurements. This suggests that
the PLSI-U clusters may be more cohesive than the Author Topic ones. This may be
borne out further when one considers the number of components in the social networks
among the four experiments (Table ??). PLSI-U run without a dictionary has by far
the fewest components (in some cases only 1), followed by PLSI-U run with
a dictionary. This may be due to the fact that the clusters have far fewer vertices in
them. However, since the measurements are normalized to remove graph size from
the metric, this is still indicative of more cohesive clusters.
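
The graph-level quantities discussed above can be illustrated with a short Python sketch. Graph density and connected components are standard. For the group degree measurement the sketch assumes Freeman's degree centralization, which may differ from the exact definition used for Table 4.1. The sample graph is hypothetical.

```python
def graph_density(adj):
    """Edges present divided by the n(n - 1) / 2 edges possible."""
    n = len(adj)
    edges = sum(len(nbrs) for nbrs in adj.values()) / 2
    return edges / (n * (n - 1) / 2) if n > 1 else 0.0


def group_degree_centralization(adj):
    """Freeman's degree centralization (an assumed stand-in for 'group degree')."""
    n = len(adj)
    if n < 3:
        return 0.0
    degrees = [len(nbrs) for nbrs in adj.values()]
    top = max(degrees)
    return sum(top - d for d in degrees) / ((n - 1) * (n - 2))


def connected_components(adj):
    """Return the components as sets of vertices, largest first."""
    seen, components = set(), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], set()
        while stack:
            v = stack.pop()
            comp.add(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        components.append(comp)
    return sorted(components, key=len, reverse=True)


# Hypothetical network: a four-vertex star plus one detached pair.
adj = {
    "a": {"b", "c", "d"}, "b": {"a"}, "c": {"a"}, "d": {"a"},
    "e": {"f"}, "f": {"e"},
}
comps = connected_components(adj)
share = 100.0 * len(comps[0]) / len(adj)
print(len(comps), len(comps[0]), round(share, 2))
print(graph_density(adj), group_degree_centralization(adj))
```

The percentage column of Table 4.2 follows the same ratio: for the first Senior Mgmt row, 4777 of 4800 vertices in the biggest component gives 99.52%.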


MOST PROBABLE INDIVIDUALS
James Derrick       General Counsel                                     3.60%
Cindy Olson         Head of Human Resources                             1.98%
Kay Chapman         Secretary of Management Committee                   1.67%
Mark Koenig         Executive Vice President of Investor Relations      1.67%
Greg Whalley        President of Enron                                  1.66%
Steven Kean         Chief of Staff, Government Relations Specialist     1.57%
Mark Frevert        Vice-Chairman of Enron                              1.54%
Jeffrey McMahon     Chief Financial Officer of Enron                    1.46%
Kenneth Lay         Chairman of Enron                                   1.25%
David Delainey      Enron Energy Services CEO                           1.19%

DEGREE
Joannie Williamson  Chief Executive Jeff Skilling’s Secretary           0.38
Rosalee Fleming     Chairman Ken Lay’s Secretary                        0.28
Bobbie Power                                                            0.24
Billy Lemmons                                                           0.24
Rebecca Carter      Executive Secretary to Enron Board                  0.22
Sherri Sera         Chief Executive Jeff Skilling’s Personal Assistant  0.21
Liz Taylor                                                              0.21
Paula Rieker        Deputy Director of Investor Relations               0.21
Billy Dorsey                                                            0.20
David Delainey      Chief Executive of Enron Energy Services            0.20

BETWEENNESS
Joannie Williamson  Chief Executive Jeff Skilling’s Secretary           0.21
Bobbie Power                                                            0.09
Traci Ralston                                                           0.09
Billy Lemmons                                                           0.07
David Delainey      Chief Executive of Enron Energy Services            0.06
Jeff Skilling       Chief Executive                                     0.06
Cindy Olson         Head of Human Resources                             0.06
Rosalee Fleming     Chairman Ken Lay’s Secretary                        0.06
Paula Rieker        Deputy Director of Investor Relations               0.05
Liz Taylor                                                              0.04

CLOSENESS
Joannie Williamson  Chief Executive Jeff Skilling’s Secretary           0.061
Rosalee Fleming     Chairman Ken Lay’s Secretary                        0.060
David Delainey      Chief Executive of Enron Energy Services            0.059
Liz Taylor                                                              0.059
Sherri Sera         Chief Executive Jeff Skilling’s Personal Assistant  0.059
Kay Chapman         Secretary of Management Committee                   0.059
Paula Rieker        Deputy Director of Investor Relations               0.059
Nicki Daw                                                               0.059
Jeff Skilling       Chief Executive                                     0.059
Cindy Olson         Head of Human Resources                             0.059

Figure 4.11: Senior Mgmt Topic - PLSI-U w/o Dictionary - Email Graph Centrality
Measurements


4.4.2.2 Position Classification. During the development of the research
question, it was proposed that by using simple SNA metrics, the positions of




Table 4.1: Group Centrality Measurements

Experiment            Topic  Vertices  Group Degree  Graph Density  Group Closeness  Group Betweenness

SENIOR MGMT
PLSI-U Dict           45     4800      0.18445       0.00083        0.00222          0.29961
Author Topic Dict     9      9817      0.08249       0.00071        0.00075          0.13983
PLSI-U NoDict         11     606       0.36260       0.02149        0.00466          0.20774
Author Topic NoDict   8      10862     0.11467       0.00083        0.00053          0.09985

CALIFORNIA CRISIS
PLSI-U Dict           2      619       0.19304       0.00647        0.00232          0.27696
Author Topic Dict     43     7434      0.12082       0.00094        0.00072          0.21977
PLSI-U NoDict         0      168       0.75254       0.05988        0.34176          0.27509
Author Topic NoDict   18     2653      0.25123       0.00339        0.00077          0.23940

RESEARCH
PLSI-U Dict           47     2855      0.14328       0.00140        0.00328          0.33934
Author Topic Dict     30     7437      0.10751       0.00081        0.00053          0.15977
PLSI-U NoDict         44     420       0.49411       0.01909        0.01826          0.43733
Author Topic NoDict   25     2949      0.16070       0.00136        0.00070          0.26934

INFO TECHNOLOGY
PLSI-U Dict           40     532       0.36089       0.00377        0.00151          0.46753
Author Topic Dict     28     7415      0.10942       0.00094        0.00078          0.16977
PLSI-U NoDict         7      687       0.38592       0.01020        0.16711          0.54800
Author Topic NoDict   10     9232      0.16899       0.00065        0.00087          0.25984

Table 4.2: Topic Component Statistics

Experiment            Topic  Vertices  Vertices in Biggest Component  Percentage  Number of Components

SENIOR MGMT
PLSI-U Dict           45     4800      4777                           99.52       8
Author Topic Dict     9      9817      9776                           99.58       16
PLSI-U NoDict         11     606       591                            97.52       3
Author Topic NoDict   8      10862     10811                          99.53       22

CALIFORNIA CRISIS
PLSI-U Dict           2      619       585                            94.51       10
Author Topic Dict     43     7434      7388                           99.38       18
PLSI-U NoDict         0      168       168                            100.00      1
Author Topic NoDict   18     2653      2600                           98.00       17

RESEARCH
PLSI-U Dict           47     2855      2835                           99.30       9
Author Topic Dict     30     7437      7383                           99.27       18
PLSI-U NoDict         44     420       414                            98.57       3
Author Topic NoDict   25     2949      2890                           98.00       11

INFO TECHNOLOGY
PLSI-U Dict           40     532       420                            78.95       17
Author Topic Dict     28     7415      7374                           99.45       16
PLSI-U NoDict         7      687       687                            100.00      1
Author Topic NoDict   10     9232      9194                           99.59       16

CATEGORY 45  SENIOR MGMT
Tracey Kozadinos                                                        0.30
Jeff Skilling       Chief Executive of Enron                            0.22
Constance Charles   Human Resources - Associate/Analyst Program         0.17
Steven Kean         Chief of Staff - Government Relations Specialist    0.15
Rosalee Fleming     Secretary for Chairman Ken Lay                      0.15
Rhonda Denton                                                           0.04
Bill Donovan                                                            0.04
Brian Ripley                                                            0.04
Janet Butler                                                            0.04
Rhonda Denton                                                           0.04

CATEGORY 2  CALIFORNIA CRISIS
Alan Comnes         Director of Government Affairs in California        0.28
Kenneth Lay         Chairman of Enron                                   0.25
Simone La                                                               0.13
Clayton Seigle                                                          0.12
Jeff Dasovich       Government Affairs Executive                        0.08
Steven Kean         Chief of Staff - Government Relations Specialist    0.01
Karen Denne         Vice President of Public Relations                  0.07
Ginger Dernehl      Admin Assistant - Global Government Affairs         0.07
Richard Shapiro     VP of Regulatory Affairs, Chief DC Lobbyist         0.07
Leonardo Pacheco                                                        0.06

CATEGORY 47  RESEARCH
Vince Kaminski      Managing Director and Head of Research              0.34
Outlook Team                                                            0.15
Jewel Meeks                                                             0.11
Kristin Gandy       Associate Recruiter for Enron                       0.09
Shirley Crenshaw    Research Group Administrative Coordinator           0.09
Jeff Dasovich       Government Affairs Executive                        0.08
Nicki Daw                                                               0.08
Richard Shapiro     VP of Regulatory Affairs, Chief DC Lobbyist         0.07
Ashley Baxter       Recruiter - Global Technology Track                 0.07
Althea Gordon       Recruiter - Associates/Analyst Program              0.07

CATEGORY 40  INFO TECHNOLOGY
Cheryl Johnson                                                          0.47
Outlook Team                                                            0.26
Emma Welsch                                                             0.15
Jim Schwieger       Vice President in Gas Trading Division              0.12
Julie Meyers                                                            0.10
Darren Vanek        Credit Analyst - Credit Risk Management             0.0!
Carolyn Gilley      Enron Networks - Information & Records Mgmt         0.08
Geoff Storey                                                            0.08
Kevin Dumas                                                             0.06
Daren Farmer        Logistics Manager                                   0.05

Figure 4.12: PLSI-U Betweenness Centrality Measurements with Sample Categories
using Stemmed, Dictionary Words


individuals  within  an  organization  can  be  determined.  For  instance,  perhaps  those 
individuals with many interests are likely to be administrative assistants (or executives)
since they are likely to be pulled in more directions than average employees.
The results are unclear (Table ??). While most of the individuals are at least vice
presidents, there seems to be a significant bias toward individuals involved in the
manipulation of California’s deregulation effort to increase Enron profits (e.g. Tim Belden,
Christian Yoder, John Lavorato, and David Parquet). This suggests that the nature of
the topics, rather than individuals’ positions within the company, is the determining
factor in the number of topics of interest an individual possesses.
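
The "number of interests" metric used above can be sketched as a count of topics per individual that pass an interest cutoff. The cutoff that defines an interest is set elsewhere in the thesis; the 0.05 value, the function names, and the sample profiles below are assumptions for illustration.

```python
def count_interests(person_topic, threshold=0.05):
    """Count, per person, the topics whose probability meets the threshold."""
    return {
        person: sum(1 for p in probs.values() if p >= threshold)
        for person, probs in person_topic.items()
    }


def top_by_interests(person_topic, threshold=0.05, top=10):
    """Rank people by number of interests (ties broken alphabetically)."""
    counts = count_interests(person_topic, threshold)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top]


# Hypothetical P(topic | person) profiles.
person_topic = {
    "trader": {0: 0.30, 2: 0.25, 18: 0.20, 43: 0.15, 7: 0.01},
    "assistant": {45: 0.60, 11: 0.30, 9: 0.02},
    "researcher": {47: 0.95},
}

ranking = top_by_interests(person_topic)
print(ranking)
```

A ranking of this kind is what Table 4.3 reports; as the text notes, the count reflects the mix of topics in the corpus as much as it reflects a person's position.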

Similarly,  when  looking  at  individuals  with  the  most  external  email  activity 
(Table  ??)  the  results  are  also  mixed.  In  some  cases,  it  is  evident  that  the  position 
dictates  the  email  traffic.  The  individual  most  responsible  for  government  affairs  has 
the  most  emails  and  the  Chairman  has  the  eighth  most  emails.  However,  in  other 
cases,  it  is  not  as  obvious.  There  are  two  lawyers  as  well  as  the  Head  of  Research  also 
in  the  top  ten.  While  the  lawyers  can  be  explained  by  considering  the  trouble  Enron 


CATEGORY 9  SENIOR MGMT
Jeff Skilling       Chief Executive                                     0.14
Bob Ambrocik        Technical Consultant                                0.12
Sally Beck          Chief Operating Officer of Enron Networks           0.08
Rosalee Fleming     Secretary to Chairman Ken Lay                       0.07
Outlook Team                                                            0.06
Omaha Help Desk                                                         0.05
Shelley Corman      VP Regulatory/Gov’t Affairs, Asst Gen Counsel       0.04
Technology.Enron                                                        0.04
Bodyshop                                                                0.04
Jeff Dasovich       Government Affairs Executive                        0.03

CATEGORY 43  CALIFORNIA CRISIS
Kenneth Lay         Chairman                                            0.22
Jeff Dasovich       Government Affairs Executive                        0.09
Outlook Team                                                            0.06
Veronica Espinoza                                                       0.05
Cynthia Morrow                                                          0.05
Steven Kean         Chief of Staff, Government Relations Specialist     0.04
Ginger Dernehl      Admin Assistant - Global Government Affairs         0.04
Shelley Corman      VP Regulatory/Gov’t Affairs, Asst Gen Counsel       0.04
Nicki Daw                                                               0.04
Maureen McVicker                                                        0.03

CATEGORY 30  RESEARCH
Bob Ambrocik        Technical Consultant                                0.16
Outlook Team                                                            0.13
Vince Kaminski      Managing Director and Head of Research              0.08
Leann Walton                                                            0.06
Sally Beck          Chief Operating Officer of Enron Networks           0.05
Jeff Dasovich       Government Affairs Executive                        0.04
Andrea Richards                                                         0.04
Shirley Crenshaw    Research Group Administrative Coordinator           0.03
Vince Kaminski      Managing Director and Head of Research              0.03
Shelley Corman      VP Regulatory/Gov’t Affairs, Asst Gen Counsel       0.03

CATEGORY 28  INFO TECHNOLOGY
Bob Ambrocik        Technical Consultant                                0.17
Sally Beck          Chief Operating Officer of Enron Networks           0.09
Lisa Jones                                                              0.07
Technology.Enron                                                        0.06
Outlook Team                                                            0.05
Rick Buy            Executive Vice President & Chief Risk Officer       0.05
Lillian Carroll                                                         0.05
Arfan Aziz                                                              0.05
Shelley Corman      VP Regulatory/Gov’t Affairs, Asst Gen Counsel       0.04
Ted Bland                                                               0.03

Figure 4.13: Author Topic Betweenness Centrality Measurements with Sample Categories
using Stemmed, Dictionary Words


was in at the time, one would not presume a priori that the Head of Research would
rank among the company’s leaders in external email traffic. Only one of the top ten individuals is a
“salesman”  (Chris  Germany,  an  Enron  trader). 
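
The external and internal email counts reported in Table 4.4 could be tallied along the following lines. This is a sketch under stated assumptions: a message is treated as external if any recipient address lies outside the organization's domain, the domain is taken to be enron.com, and counts are per sent message. The thesis may define these differently. All addresses below are invented.

```python
from collections import Counter


def is_external(recipients, internal_domain="enron.com"):
    """True if any recipient address lies outside the organization's domain."""
    return any(r.rsplit("@", 1)[-1].lower() != internal_domain for r in recipients)


def email_counts(emails, internal_domain="enron.com"):
    """Return {sender: (external message count, internal message count)}."""
    external, internal = Counter(), Counter()
    for sender, recipients in emails:
        if is_external(recipients, internal_domain):
            external[sender] += 1
        else:
            internal[sender] += 1
    return {s: (external[s], internal[s]) for s in set(external) | set(internal)}


# Hypothetical (sender, recipients) pairs.
emails = [
    ("lobbyist@enron.com", ["staffer@example.gov"]),
    ("lobbyist@enron.com", ["chairman@enron.com"]),
    ("lobbyist@enron.com", ["chairman@enron.com", "reporter@example.com"]),
    ("trader@enron.com", ["desk@enron.com"]),
]

counts = email_counts(emails)
for sender, (ext, internal) in sorted(counts.items(), key=lambda kv: -kv[1][0]):
    print(sender, ext, internal)
```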

The  lack  of  revealing  results  from  this  analysis  suggests  that  a  more  complex 
model  may  be  more  appropriate.  To  see  how  Author  Topic  can  be  extended  to 
classification  of  roles,  refer  to  McCallum,  et  al.  [?]. 


Table 4.3: Individuals with the most interests

Name                Position                                              Number of Interests
Greg Wolfe          VP of Marketing                                       8
William Bradford                                                          8
Fred Lagrasta       VP Risk Management Marketing                          8
Scott Neal          VP Enron Capital and Trade Resources - Latin America  8
Leslie Reeves                                                             8
Peggy Hedstrom      VP of Energy Operations - Calgary, CA                 8
Christian Yoder     Senior Counsel - Portland, OR                         8
Debbie Brackett     Sr. Director - Credit Risk Management                 9
Tim Belden          Head of Enron Energy Trading in Portland              9
John Lavorato       CEO of Enron North America                            10
David Parquet       Vice President of Enron North America                 10

CATEGORY 11  SENIOR MGMT
Joannie Williamson  Secretary to CEO Jeff Skilling                      0.21
Bobbie Power                                                            0.09
Tracy Ralston                                                           0.07
Billy Lemmons                                                           0.07
David Delainey      CEO of Enron Energy Services                        0.06
Jeff Skilling       CEO of Enron                                        0.06
Cindy Olson         Head of Human Resources                             0.06
Rosalee Fleming     Secretary to Chairman Ken Lay                       0.06
Paula Rieker        Deputy Director of Investor Relations               0.05
Liz Taylor                                                              0.04

CATEGORY 0  CALIFORNIA CRISIS
Susan Mara          Director of Government Affairs in California        0.28
Jeff Dasovich       Government Affairs Executive                        0.24
Alan Comnes         Director of Government Affairs                      0.12
Joseph Alamo                                                            0.09
Sandra McCubbin     Director of Government Affairs in California        0.09
Dan Leff                                                                0.08
Tamara Johnson                                                          0.06
Michael Tribolet    VP of Underwriting and Investment Valuation         0.06
Leticia Botello                                                         0.04
Thomas Bennett                                                          0.02

CATEGORY 44  RESEARCH
Vince Kaminski      Managing Director and Head of Research              0.44
Vince Kaminski      Managing Director and Head of Research              0.18
Shirley Crenshaw    Research Group Administrative Coordinator           0.16
Ravi Thuraisingham  Director of Global Bandwidth Risk Management        0.07
Anjam Ahmad                                                             0.05
Anita Dupont                                                            0.05
Vince Kaminski      Managing Director and Head of Research              0.05
Vasant Shanbhogue   Vince Kaminski’s Second in Command                  0.04
Zimin Lu            Director of Valuation and Training Analytic Group   0.04
Steven Leppard                                                          0.04

CATEGORY 7  INFO TECHNOLOGY
Cynthia Morrow                                                          0.55
Regan Smith         Network Administrator                               0.17
Georgia Ward        QA in Development Support                           0.13
Brandee Jackson                                                         0.09
Bryce Baxter                                                            0.08
Kenneth Harmon                                                          0.07
Rita Wynne          Manager for Volume Management Group                 0.06
Brian Ripley                                                            0.05
Tony Dugger                                                             0.05
Anwar Melethil                                                          0.04

Figure 4.14: PLSI-U Betweenness Centrality Measurements with Sample Categories
using Stemmed Words without a Dictionary


4.4.2.3 An Explicit Network without Topics. One additional question
is how the explicit networks generated from Author Topic and PLSI-U compare to a
traditional social network where two individuals are linked if an email exists between
them, regardless of the topic discussed in the email. The graph level metrics are as
expected when one considers the size of the graph (21,790 vertices). The group degree is
0.068. The graph density is 0.0008. The group closeness is 0.00032 and the group
betweenness is 0.060. All of these measurements suggest a sparse graph with a wide
variation in the number of edges (emails) connecting different individuals. When one


Table 4.4: Individuals with the most external emails

Name                Position                                      External Emails  Internal Emails
Jeff Dasovich       Government Affairs Executive                  4,104            11,087
Kay Mann            Legal Counsel                                 2,716            6,417
Matthew Lenhart                                                   2,229            2,193
Vince Kaminski      Managing Director and Head of Research        2,015            8,764
Sara Schackleton    VP and General Counsel - Enron North America  1,971            10,275
Jeff Dasovich       Government Affairs Executive                  1,767            337
Chris Germany       Trader for Enron North America - East Desk    1,650            4,797
Ken Lay             Chairman                                      1,538            20
Gerald Nemec                                                      1,489            5,633
Tana Jones                                                        1,119            10,799

CATEGORY 8  SENIOR MGMT
Sally Beck          Chief Operating Officer of Enron Networks           0.10
Technology.Enron                                                        0.10
Kenneth Lay         Chairman                                            0.10
Billy Lemmons                                                           0.05
Outlook Team                                                            0.05
Jeff Skilling       Chief Executive                                     0.04
Andrew Wu                                                               0.04
David Oxley         Human Resources Executive                           0.03
Louise Kitchen      CEO of Enron Online                                 0.03
David Forster                                                           0.03

CATEGORY 25  RESEARCH
Vince Kaminski      Managing Director and Head of Research              0.2'
Khymberly Booth                                                         0.18
Shirley Crenshaw    Research Group Administrative Coordinator           0.14
Iris Mack           Manager in Research Group - Broadband Services      0.13
Cheryl Johnson                                                          0.12
Vince Kaminski      Managing Director and Head of Research              0.10
Maureen Raymond     Head of Country Risk & Foreign Exchange             0.08
Leann Walton                                                            0.07
Kathie Grabstald    Public Relations for Enron Wholesale Services       0.07
Gwyn Koepke                                                             0.06

CATEGORY 18  CALIFORNIA CRISIS
Jeff Dasovich       Government Affairs Executive                        0.24
Mary Hain           Government Affairs lawyer                           0.10
Ginger Dernehl      Admin Assistant - Global Government Affairs         0.07
Paul Kaufman        VP & Western U.S. regulatory & govt lawyer          0.06
Susan Mara          Director of Government Affairs in California        0.06
Richard Shapiro     VP of Regulatory Affairs, Chief DC lobbyist         0.06
Rhonda Denton                                                           0.06
Lara Leibman        Manager in Government Affairs                       0.06
Christi Niclay      Director of Govt Affairs - Electric Power Trading   0.05
Alan Comnes         Director of Government Affairs in California        0.04

CATEGORY 10  INFO TECHNOLOGY
Outlook Team                                                            0.26
Lynette Crawford                                                        0.11
Sonya Johnson                                                           0.10
Suzanne Brown                                                           0.09
David Forster                                                           0.07
Julie Clyatt                                                            0.07
Constance Charles   Human Resources - Associate/Analyst Program         0.07
Paulette Obrecht    Legal Project Coordinator                           0.04
Kay Chapman         Secretary of Management Committee                   0.03
Khymberly Booth                                                         0.03

Figure 4.15: Author Topic Betweenness Centrality Measurements with Sample Categories
using Stemmed Words without a Dictionary

looks  at  the  individuals  who  are  most  central  to  the  graph,  some  of  the  senior  Enron 
individuals  emerge  (Figure  ??). 


Jeff Skilling       Chief Executive                                     0.06
Outlook Team                                                            0.06
Sally Beck          Chief Operating Officer of Enron Networks           0.04
David Forster                                                           0.04
Kenneth Lay         Chairman                                            0.04
Jeff Dasovich       Government Affairs Executive                        0.0:
Louise Kitchen      CEO of Enron Online                                 0.02
John Lavorato       CEO of Enron North America                          0.02
Jacqueline Coleman                                                      0.02
Technology.Enron                                                        0.02

Figure 4.16: Most central individuals in a traditional Enron social network
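
The traditional network described in this section links two individuals whenever at least one email exists between them, whatever its topic. A minimal Python sketch of that construction follows. Treating the graph as undirected and ignoring self-addressed mail are assumptions, and the sample messages are hypothetical.

```python
def build_email_graph(emails):
    """Link two individuals if at least one email exists between them."""
    adj = {}
    for sender, recipients in emails:
        adj.setdefault(sender, set())
        for recipient in recipients:
            if recipient == sender:
                continue  # ignore self-addressed mail
            adj.setdefault(recipient, set())
            adj[sender].add(recipient)
            adj[recipient].add(sender)
    return adj


def edge_count(adj):
    """Number of undirected edges (each appears in two adjacency sets)."""
    return sum(len(nbrs) for nbrs in adj.values()) // 2


# Hypothetical (sender, recipients) pairs; topics play no role here.
emails = [
    ("ceo", ["chairman", "assistant"]),
    ("assistant", ["ceo"]),  # repeat contact still yields a single edge
    ("chairman", ["counsel"]),
]

adj = build_email_graph(emails)
n, m = len(adj), edge_count(adj)
print(n, m, m / (n * (n - 1) / 2))
```

Because repeated messages collapse into one edge, a graph built this way records who communicates with whom but not how often, so any weighting by email volume would have to be added separately.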


4.4.2.4 Temporal Analysis. A final goal of this research is to track a
topic as it progresses through the organization’s email. Although several topics were
reviewed, including Jeff Skilling’s resignation, Sherron Watkins’ letter to Ken Lay,
Jeff Skilling and Rebecca Carter’s marriage, and the renaming of the Astrodome to
Enron Field, none of these topics emerge as moving through the organization by email
over time. Given the attractiveness of these rumors, this suggests that Enron employees
used means of communication other than email, such as the telephone or face-to-face
conversation. It is also possible that the reason is the
nature of the Enron dataset itself. Since it is extracted from the email folders of only
151 Enron employees, it may be more difficult to perform this temporal analysis.
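
One way such a temporal analysis could be set up is to record, for a given topic, the first date on which each individual sends or receives an email on it, and then to watch the cumulative count grow. The Python sketch below illustrates the idea only. It is not the procedure used in the experiments, and the dates, names, and topic labels are invented.

```python
from datetime import date


def first_exposure(emails, topic):
    """Map each person to the first date they sent or received mail on `topic`."""
    first = {}
    for sent_on, sender, recipients, email_topic in sorted(emails, key=lambda e: e[0]):
        if email_topic != topic:
            continue
        for person in (sender, *recipients):
            first.setdefault(person, sent_on)  # keep the earliest date only
    return first


def cumulative_reach(first):
    """Return (date, number of people reached so far) in date order."""
    reach, total = [], 0
    for day in sorted(set(first.values())):
        total += sum(1 for d in first.values() if d == day)
        reach.append((day, total))
    return reach


# Hypothetical (date, sender, recipients, topic) records.
emails = [
    (date(2001, 1, 2), "a", ["b", "c"], "rumor"),
    (date(2001, 1, 3), "b", ["d"], "rumor"),
    (date(2001, 1, 3), "c", ["e"], "lunch"),
    (date(2001, 1, 4), "d", ["a"], "rumor"),
]

reach = cumulative_reach(first_exposure(emails, "rumor"))
print(reach)
```

A topic that spreads by email would show a reach curve that keeps rising over successive days; a flat curve is consistent with the finding that none of the reviewed topics moved through the organization by email.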

4.5 Resources

This thesis requires the development and use of multiple computer programs to
extract,  cluster,  and  analyze  the  data.  All  program  development  and  execution  has 
occurred  on  a  Pentium  4  2.6GHz  machine  with  512MB  RAM  running  Suse  Linux 
9.2.  The  programs  were  developed  in  C  and  C++  using  the  gcc  3.3.4-11  compiler.  In 
addition,  several  Unix  shell  scripts  were  developed  to  aid  in  processing.  The  database 
used  was  MySQL  version  4.1.12-max. 

4.6 Conclusions

The  primary  purpose  of  this  research  is  to  determine  if  probabilistic  clustering 
and  social  networking  techniques  applied  to  email  are  effective  at  detecting  potential 
insider  threats.  Consider  the  results: 

4.6.1 Metric 1: Usability. Both PLSI-U and Author Topic clearly succeed
at  this  metric.  The  topics  are  clearly  defined  by  their  most  probable  words  and 
the  most  probable  individuals  for  the  topics  are  highly  appropriate.  Both  provide 
appropriate  names  as  additional  leads  given  a  topic  of  interest,  although  both  perform 
better  when  words  are  not  restricted  to  the  dictionary.  This  can  be  further  improved 
for  Author  Topic  by  excluding  names  from  the  word  lists.  The  only  place  where 
PLSI-U  falls  short  is  in  the  number  of  individuals  extracted  with  clandestine  interests. 
While Author Topic succeeds at this metric by extracting between 0.1% and 0.2% of
the population as potential insiders, PLSI-U fails to extract enough. This is borne out
empirically by the discovery of Sherron Watkins by Author Topic and not by PLSI-U.
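
The clandestine-interest test behind these percentages can be sketched as a set difference. Based on the description in this chapter, a person is flagged when they belong to the implicit (interest-based) network for a topic but not to the explicit (email-based) network for it. This reading, the function names, and the sample data are assumptions made for illustration.

```python
def clandestine_interests(implicit, explicit):
    """Return {person: topics they are interested in but never emailed about}.

    implicit: topic -> people whose interest profile includes the topic
    explicit: topic -> people who exchanged internal email on the topic
    """
    flagged = {}
    for topic, interested in implicit.items():
        for person in interested - explicit.get(topic, set()):
            flagged.setdefault(person, set()).add(topic)
    return flagged


def flagged_fraction(flagged, population_size):
    """Share of the population flagged as having a clandestine interest."""
    return len(flagged) / population_size


# Hypothetical networks over two topics.
implicit = {"raptor": {"w", "f", "k"}, "california": {"d", "m"}}
explicit = {"raptor": {"f", "k"}, "california": {"d", "m", "x"}}

flagged = clandestine_interests(implicit, explicit)
print(flagged, flagged_fraction(flagged, population_size=1000))
```

Under this reading, a single internal email on the topic is enough to place someone in the explicit network and remove the flag.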

4.6.2 Metric 2: Timeliness. For PLSI-U, each iteration took almost two
hours to perform, resulting in a total run time of almost one week. Similarly, Author
Topic took between 6 and 9 hours to perform 100 iterations, resulting in a total run
time of also approximately a week. While it is possible to imagine a larger corpus,
Enron’s 250,000 emails and 35,000 individuals seem on the high side for email data
collected on a monthly basis, even for organizations that make extensive use of email,
if broadcast messages are excluded. Although the average individual’s email count is only
in the low 70s, several individuals received emails in the thousands and even tens of
thousands. The ability to produce results in a week should result in a system that is
sufficiently responsive.

4.6.3 Metric 3: Validity. For Author Topic run restricted to dictionary
words, Sherron Watkins emerges as a potential insider. Furthermore, for PLSI-U run
without being restricted to the dictionary, Sherron Watkins also emerges as having an
interest in the Raptor topic. However, in this experiment, she does not appear to have
a clandestine interest in it. This is due to the fact that despite the Raptor topic being
one of her two principal interests, her interest is not great enough for her to be considered a
member of the implicit network for that topic. The one disconcerting result is that
Author Topic run without the dictionary does not cluster Ms. Watkins with a Raptor
topic. Perhaps additional experiments run with Author Topic not restricted to the
dictionary but with all names excluded will produce better results.

The emergence of common topics of interest for Lay, Skilling, and Fastow provides
implicit support that once one insider is known, these techniques can be used
to extract additional ones. Finally, despite the fact that Kopper and Fastow did not
emerge as having a common interest in any of the Raptor topics, something
interesting does emerge. For each experiment, the Research topic emerged as being
a promising topic for the Raptors. Presumably, given how close-knit the individuals
associated with this topic are, members of the Research Group would have
emerged as having a significant interest in the Raptors, thus providing a means, after
the fact, of finding people to talk to about them.

4.6.4 Social Network Analysis (SNA). Centrality measurements extract
individuals for a topic very similar to the most probable individuals extracted by PLSI-U
and Author Topic. Interestingly, whether degree, closeness, or betweenness is used,
the top ranked individuals vary only slightly. Also, just as with the most probable
individuals, PLSI-U produces more cohesive, better identifiable and better clustered
results  than  Author  Topic.  Where  SNA  fails  to  perform  is  in  extracting  the  positions 
of  individuals  within  the  organization.  The  two  tests  performed  clearly  indicate  a  more 
complex  model  is  required.  In  addition,  SNA  fails  to  extract  any  indications  of  topics 
moving  through  the  dataset  over  time.  While  this  could  be  a  result  of  the  enormous 
amount  of  data,  the  two-tiered  approach  shows  that  the  48  topics  provide  a  reasonable 
amount  of  granularity  for  this  dataset.  Therefore,  it  is  unlikely  that  increasing  the 
number  of  topics  would  improve  the  results  of  the  SNA  analysis.  Instead,  it  is  likely 
that  the  Enron  dataset  itself  is  not  conducive  to  this  effort,  perhaps  due  to  the  dataset 
being  extracted  from  the  email  folders  of  only  151  Enron  employees. 
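
To make the centrality measures named above concrete, the following is a minimal sketch of degree and closeness centrality on a small undirected graph. The graph, the user names, and the function names are illustrative assumptions and are not taken from the Enron data or from the tool described in this work; betweenness is omitted for brevity.

```python
from collections import deque


def degree_centrality(graph):
    """Fraction of the other vertices that each vertex is directly linked to."""
    n = len(graph)
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}


def closeness_centrality(graph):
    """(reachable vertices) / (sum of shortest-path distances), found by BFS.

    Only vertices reachable from the source are counted, so in a
    disconnected graph each score is relative to the source's component.
    """
    scores = {}
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


# Hypothetical explicit network for one topic: an edge means two users
# exchanged at least one email on that topic (all names are made up).
graph = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann", "cat"},
    "cat": {"ann", "bob"},
    "dan": {"ann", "eve"},
    "eve": {"dan"},
}

degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
top_by_degree = max(degree, key=degree.get)
top_by_closeness = max(closeness, key=closeness.get)
```

In this toy graph both measures rank the same user first, which mirrors the observation that the top-ranked individuals vary only slightly across centrality measures.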
V. Discussion


No country or organization can take on the military might of the United States of
America head on. The Gulf War of 1991 is the last time, for a while, that a country
will go “toe to toe” with the United States in the field. What has occurred, and will
continue to occur with greater and greater frequency, is asymmetric warfare. While the
destruction of the World Trade Center and the attack on the Pentagon are the two
best-known examples, the asymmetric attacks on America’s economy are no less destructive.
Unfortunately, while it is easy to distinguish friend from enemy when it comes to physical
attacks, the problem is much more complex for economic attacks. In addition to
countries like China, it is considered likely that supposed allies such as France, Germany,
and Japan provide active state support for economic espionage [?]. While it is
possible to insert traditional spies into organizations, turning people who were once
loyal employees has emerged as the most effective way of stealing an organization’s
secrets.

5.1  Experimental  Results 

As such, it is critical to find ways to protect the information. This research
proposes one method for detecting potential insider threats by finding individuals
who exhibit some of the warning signs. One specific warning sign many individuals
demonstrate is a need to separate themselves from the organization prior to betraying
it. This separation process may manifest itself as a feeling of alienation from the
organization. By datamining email, individuals’ interests can be extracted. From
those interests, one can predict who feels alienated from an organization and who
is hiding an interest in sensitive topics. The goal is to test the hypothesis that
“probabilistic clustering and social networking techniques applied to email are effective
at detecting potential insider threats”. To accept this hypothesis, the techniques must
be valid, usable, and timely.

Finding an appropriate corpus to test this hypothesis is difficult. Artificial
datasets are too small and real-world datasets are unavailable due to privacy concerns.
Luckily, the release of Enron’s 2000 and 2001 email data by the Federal Energy
Regulatory Commission during its investigation has provided a valuable dataset for
researchers. By using this dataset, it is possible to test whether Sherron Watkins, Enron’s
famous whistleblower, emerges as an insider. At the same time, the effectiveness
of discovering additional people possibly linked to a known insider can be tested by
checking whether notables such as Ken Lay (Enron’s chairman), Jeff Skilling (Enron’s CEO),
and Andy Fastow (Enron’s CFO) emerge as having similar interests.

The experiments are run for Author Topic and Probabilistic Latent Semantic
Indexing with Users (PLSI-U). Experiments are run on the Enron corpus twice: first
with the vocabulary restricted to words found in the dictionary, and then with all
“words” included whether or not they are in the dictionary. In all experiments,
words are stemmed, resulting in a 50% reduction in the number of words considered.

The results show that both Author Topic and PLSI-U produce good results.
When the restriction of only dictionary words is removed, both perform excellently
at clustering words to produce topics that are easily understandable. Furthermore,
both also perform excellently at associating individuals with appropriate topics. As a
result, if a topic emerges that suggests a potential insider threat, both techniques do
well at uncovering individuals interested in that topic. However, while both perform
well at the forensic analysis of finding associates of known insiders, only Author
Topic performs well on the Enron dataset at revealing potential insiders through
clandestine interests: it reveals Sherron Watkins as an insider. She emerges as
having a clandestine interest in the off-the-book partnerships and as feeling alienated.
PLSI-U finds too few individuals with clandestine interests to be useful. Finally, both
techniques produce results in a timely manner, taking less than a week to produce
results on a corpus of over 250,000 messages involving over 34,000 employees.

5.2  Future  Work 

The initial research question was “Are probabilistic clustering techniques applied
to email and internet activity effective at detecting potential insider threats?”. This
had to be scaled back due to privacy concerns related to using non-publicly available
email and internet activity for a research project. However, this resulted in the
overloading of email: in this experiment, email is used both to define the topics and
to determine who is not revealing their interest in a topic. On the surface this should
result in no clandestine interests, since the only way someone is considered to have an
interest in a topic is if they send or receive an email about it. The only reason this
is not the case is that internal and external emails are both considered when topics
are defined, while only internal emails are used in the search for clandestine interests.
By using a different data source for generating topics of interest, more
clandestine interests may emerge. Internet history is kept on servers in the same
way that email history is. In addition, PLSI-U and Author Topic can easily be adapted
from documents made up of words to web pages made up of hyperlinks [?]. While
internet activity was not available for Enron, it is generally available from the same
sources that supply email history logs. By excluding the names of individuals from
the email bodies and encoding the identities of the individuals in the message and
internet activity headers, this experiment could effectively be expanded to include
both email and internet activity.
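
The internal-versus-all comparison described above can be illustrated with a minimal sketch. It assumes each email has already been assigned a single most probable topic and is flagged as internal or external. The thesis itself works with topic probabilities and implicit and explicit networks, so the hard labels, the tuple layout, and the function name here are simplifying assumptions for illustration only.

```python
from collections import defaultdict


def clandestine_interests(emails):
    """Map each user to topics seen in their mail overall but never in internal mail.

    `emails` is an iterable of (sender, is_internal, topic) tuples; this
    layout is a hypothetical simplification, not the thesis data model.
    """
    all_topics = defaultdict(set)       # topics from every email the user sent
    internal_topics = defaultdict(set)  # topics from internal emails only
    for sender, is_internal, topic in emails:
        all_topics[sender].add(topic)
        if is_internal:
            internal_topics[sender].add(topic)

    hidden = {}
    for user, topics in all_topics.items():
        undiscussed = topics - internal_topics[user]
        if undiscussed:
            hidden[user] = undiscussed
    return hidden


# Made-up example: u1 writes about partnerships only to external recipients.
emails = [
    ("u1", True, "gas trading"),
    ("u1", False, "gas trading"),
    ("u1", False, "partnerships"),
    ("u2", True, "partnerships"),
    ("u2", True, "football"),
]
result = clandestine_interests(emails)
```

The sketch also shows why a single data source limits the approach: if only internal email were available, `all_topics` and `internal_topics` would coincide and no clandestine interest could ever be reported.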

An  issue  that  emerged  during  analysis  was  the  unknown  positions  of  many  of 
the  Enron  employees.  Although  the  Internet  and  books  help,  the  jobs  of  a  large 
percentage  of  Enron  employees  remain  unknown.  Future  research  that  assembles  the 
positions  of  all  of  the  Enron  employees  in  the  dataset  would  result  in  much  improved 
data  analysis. 

A second problem that emerged during the setup of the experiment was the
number of topics. Although the final number was chosen based on hardware and
software considerations, it would have been desirable to arrive at an optimal number
of topics. Teh et al. [?] provide a mechanism for determining the optimal number of
topics without having to decide a priori. Although McCallum et al. [?] suggest that
the method of Teh et al. is not effective at performing the actual clustering, they
still use the number of topics it suggests. Performing this analysis on the Enron corpus would
provide a means of finding a better number of topics, possibly resulting in tighter
clusters and better results by one or both of the algorithms. An alternative method is
to perform a second level of probabilistic clustering on the subset of documents and
individuals associated with a specific topic. Unfortunately, when this was performed
on the California Crisis topic, no additional information was revealed. However, this
was a limited test, and it is possible that if it were performed on all of the topics,
some of them might yield better results.

One question often asked is whether too much information is lost when context
is taken away from emails and replaced with word frequencies. While the results of
the experiment show that the simplified model does maintain enough information to
be useful, there is an even simpler, more direct method of testing it. By adding in
some domain knowledge, it is possible to see whether simply extracting user
and word co-occurrence frequencies causes certain logical groupings (such as
organizational units) to occur. This is very similar to the traditional information
retrieval term frequency-inverse document frequency (TF-IDF) model. Since both words
and people are represented explicitly in the database, this can be tested directly by
database queries without the need for extensive C programming.
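
As a rough illustration of testing this with database queries, the sketch below builds a tiny in-memory SQLite database and computes user and word co-occurrence totals with a single join. The table names, column names, and sample rows are hypothetical; the schema of the actual database is not described in this section.

```python
import sqlite3

# In-memory database with a hypothetical two-table layout:
# which (stemmed) words occur in which email, and which users
# are associated with which email.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE email_word (email_id INTEGER, word TEXT, freq INTEGER);
    CREATE TABLE email_user (email_id INTEGER, user TEXT);
""")
conn.executemany("INSERT INTO email_word VALUES (?, ?, ?)", [
    (1, "pipelin", 3), (1, "meter", 2),
    (2, "pipelin", 1), (2, "legisl", 4),
    (3, "legisl", 2), (3, "senat", 1),
])
conn.executemany("INSERT INTO email_user VALUES (?, ?)", [
    (1, "ann"), (2, "ann"), (2, "bob"), (3, "bob"),
])

# User-word co-occurrence: total frequency of each word across all
# emails associated with each user, most frequent words first.
rows = conn.execute("""
    SELECT u.user, w.word, SUM(w.freq) AS total
    FROM email_user AS u
    JOIN email_word AS w ON w.email_id = u.email_id
    GROUP BY u.user, w.word
    ORDER BY u.user, total DESC, w.word
""").fetchall()
```

Users whose highest-ranked words overlap could then be compared against known organizational units to see whether the raw frequencies alone recover those groupings.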

Finally,  two  additional  areas  of  future  research  emerged  during  the  analysis  of 
results.  The  performance  of  Author  Topic  appeared  to  degrade  when  the  restriction 
of  only  dictionary  words  was  removed.  This  was  due  in  large  part  to  the  prevalence 
of  names  as  the  most  probable  words  for  topics.  Attempts  to  exclude  all  names 
(including  those  that  are  found  in  the  dictionary  such  as  “ken”  and  “skill”)  may 
produce  better  results.  Secondly,  the  biggest  problem  found  with  PLSI-U  was  the 
small  size  of  its  implicit  and  explicit  networks.  Finding  a  better  way  to  determine 
who  should  be  included  in  the  PLSI-U  networks  would  overcome  the  biggest  drawback 
to  this  technique. 


5.3  Impact 

This research has developed a tool that effectively datamines large datasets
of email and extracts the interests of individuals. This tool has proven effective at
revealing individuals with clandestine interests who may become insider threats and
at revealing clusters of people who have the same, possibly questionable, interests.
In the short term, this tool can be applied in a real-world setting as one of several
tools for detecting potential insiders, assisting management in allocating its limited
resources toward preventing potential insiders from becoming actual insider threats.
Appendix A. Most Probable Words

A.1 PLSI-U with only Dictionary Words


0 

1

assembl 

0.0027719999 

alia 

0.0399390012 

crisi 

0.0023409999 

unknown 

0.0360470004 

ab 

0.0021980000 

varianc 

0.0304809995 

sue 

0.0016960000 

detect 

0.0237600002 

jack 

0.0016570000 

pars 

0.0214889999 

ponder 

0.0015219999 

ancillari 

0.0191290006 

legislatur 

0.0014770000 

award 

0.0191120002 

conserv 

0.0014730000 

attempt 

0.0086970003 

republican 

0.0014560000 

borland 

0.0083670001 

stout 

0.0014140001 

engin 

0.0083590001 

sen 

0.0013040000 

tie 

0.0058269999 

megawatt 

0.0012260000 

manual 

0.0051139998 

parquet 

0.0011549999 

intervent 

0.0050900001 

blackout 

0.0010660000 

download 

0.0050889999 

jean 

0.0010600000 

export 

0.0042679999 

dean 

0.0010370000 

insuffici

0.0031940001 

carter 

0.0010160001 

wheel 

0.0029290000 

freez

0.0010060000 

memori 

0.0027960001 

curt 

0.0009570000 

interchang 

0.0025299999 

deregul 

0.0009360000 

retriev 

0.0024200000 

lynch 

0.0009350000 

match 

0.0018320000 

renew 

0.0008830000 

lock 

0.0010050000 

judg 

0.0008770000 

disk 

0.0009740000 

grid 

0.0008770000 

bad 

0.0007890000 

gate 

0.0008750000 

mead

enter

0.006097000092268 

materi 

0.023755000904202 

credit 

0.005828000139445 

disclosur 

0.023695999756455 

wa 

0.005820000078529 

distribut 

0.023656999692321 

onli 

0.005603000055999 

delet 

0.023532999679446 

time 

0.005594000220299 

reli 

0.023376999422908 

id 

0.005586000159383 

properti 

0.023050999268889 

servic 

0.005464000161737 

bind 

0.023031000047922 

dai 

0.005456000100821 

creat 

0.023025000467896 

todai 

0.005386999808252 

otherwis 

0.023011999204755 

provid 

0.005160999950022 

evid 

0.022952999919653 

guarante

0.005135000217706 

administr 

0.022940000519156 

address 

0.005100999958813 

34 

hereto 

0.022933999076486 

35 

market 

0.025608999654651 

god 

0.012435000389814 

price 

0.015150000341237 

bless 

0.010075000114739 

trade 

0.013712000101805 

send 

0.009266000241041 

year 

0.011126999743283 

thi 

0.008414000272751 

product 

0.009843000210822 

life 

0.007474000100046 

month 

0.008558999747038 

address 

0.007277000229806 

week 

0.008524999953806 

list 

0.007168000098318 

sell 

0.007156000006944 

ar 

0.007081000134349 

end 

0.006864999886602 

thei 

0.006643999833614 

demand 

0.006847999989986 

provid 

0.006380999926478 

expect 

0.006659000180662 

peopl 

0.006359999999404 

produc 

0.006591000128537 

read 

0.006337999831885 

term 

0.006401999853551 

faith 

0.006293999962509 

industri 

0.006300000008196 

love 

0.006271999794990 

bui 

0.006231000181288 

prayer 

0.006184999831021 

ga 

0.006128999870270 

believ 

0.006184999831021 

natur 

0.006008999887854 

todai 

0.006118999794126 

high 

0.005752000026405 

daili 

0.005878999829292 

short 

0.005528999958187 

simpli 

0.005857000127435 

suppli 

0.005375000182539 

becaus 

0.005834999959916 

pulp 

0.005375000182539 

lord 

0.005681999959052 

level 

0.005340999923646 

person 

0.005594999995083 

fundament 

0.005307000130415 

king 

0.005594999995083 

spread 

0.005272999871522 

wai 

0.005551000125706 

weather 

0.005135999992490 

36 

onli 

0.005506999790668 

37 

made 

0.014774999581277 

ga 

0.041397001594305 

thi 

0.014592000283301 

deal 

0.032260999083519 

fund 

0.014581999741495 

volum 

0.030396999791265 

befor

0.014550000429153 

subject 

0.029086999595165 

plan 

0.014538999646902 

farmer 

0.022848000749946

effort

0.014484999701381 

dai 

0.020541999489069 

time 

0.014464000239968 

thi 

0.019508000463247 

mani 

0.014441999606788 

forward 

0.018068000674248 

year 

0.014324000105262 

pleas 

0.017015999183059 

ken 

0.014324000105262 

month 

0.016019999980927 

reach 

0.014259999617934 

meter 

0.015465999953449 

set 

0.014228000305593 

nomin 

0.013565000146627 

pleas 

0.014228000305593 

flow 

0.012919000349939 

energi 

0.014228000305593 

deliveri 

0.012586999684572 

stock 

0.014217000454664 

contract 

0.011350999586284 

dure 

0.014217000454664 

transport 

0.010649000294507 

monei 

0.014174000360072 

price 

0.010409000329673 

lost 

0.014131000265479 

chang 

0.010114000178874 

compani

0.014131000265479 

invoic 

0.009874000214040 

pai 

0.014120999723673 

show 

0.008859000168741 

state 

0.014099000021815 

fuel 

0.008120999671519 

net 

0.014088000170887 

wa 

0.007935999892652 

lai 

0.014077999629080 

corp 

0.007880999706686 

report 

0.014046000316739 

ticket 

0.007732999976724 

million 

0.014046000316739 

38 

march 

0.007695999927819 

39 

click 

0.015893999487162 

forward 

0.009569000452757 

receiv 

0.011936999857426 

man 

0.007660000119358 

imag 

0.011342000216246 

friend 

0.006275000050664 

onlin 

0.010138999670744 

net 

0.006033000070602 

link 

0.009661000221968 

sai 

0.005456000100821 

dear 

0.009216999635100 

ey 

0.005214999895543 

visit 

0.009065999649465 

woman 

0.005204000044614 

custom 

0.009022999554873 

walk 

0.004794999957085 

servic 

0.008755000308156 

men 

0.004637999925762 

offer 

0.008592000231147 

life 

0.004637999925762 

special 

0.008078999817371 

stop 

0.004375999793410 

privaci 

0.007786000147462 

littl 

0.004271000158042 

repli 

0.006626000162214 

pass 

0.004155000206083 

address 

0.006436000112444 

door 

0.004081999883056 

free 

0.006246999837458 

word 

0.003914000000805 

featur

0.006225000135601 

women 

0.003893000073731 

account 

0.006202999968082 

care 

0.003871999913827 

subscrib 

0.006169000174850 

smile 

0.003724999958649 

regist 

0.006091000046581 

world 

0.003704000031576 

prefer 

0.005764000117779 

hand 

0.003598999930546 

order 

0.005725000053644 

live 

0.003535999916494 

list 

0.005681999959052 

everi 

0.003483999986202

web

0.005656000226736 

ring 

0.003451999975368 

select 

0.005561000201851 

stand 

0.003441999899223 

messag 

0.005198999773711 

40 

turn 

0.003409999888390 

41 

pleas 

0.011548000387847 

park 

0.038644999265671 

travel 

0.010681999847293 

subject 

0.017247999086976 

thi 

0.010681999847293 

thi 

0.013098999857903 

matt 

0.010413000360131 

march 

0.012962999753654 

ticket 

0.010297000408173 

ar 

0.012176999822259 

servic 

0.010123999789357 

net 

0.010115999728441 

avail 

0.009970000013709 

origin 

0.009572999551892 

smith 

0.009930999949574 

pleas 

0.008949999697506 

ar 

0.009200000204146 

back 

0.008759999647737 

flight 

0.008795999921858 

messag 

0.006889000069350 

call 

0.008679999969900 

good 

0.006860999856144 

hotel 

0.008352999575436 

man 

0.006806999910623 

fare 

0.008198999799788 

hunt 

0.006725999992341 

book 

0.007794999983162 

night 

0.006372999865562 

number 

0.007410000078380 

wa 

0.006318999920040 

subject 

0.007294999901205 

work 

0.006020999979228 

airlin 

0.007063999772072 

wai 

0.005640999879688 

free 

0.007044000085443 

deer 

0.005613999906927 

mai 

0.006833000108600 

send 

0.005586999934167 

itinerari 

0.006833000108600 

date 

0.005396999884397 

airport 

0.006736000068486 

friend 

0.005233999807388 

chang 

0.006525000091642 

men 

0.005125999916345 

arriv 

0.006525000091642 

ani 

0.005125999916345 

cancel 

0.006370999850333 

mai 

0.005044000223279 

mat 

0.006312999874353 

42 

call 

0.004962999839336 

43 

veri 

0.006866000127047 

electr 

0.018485000357032 

sai 

0.006401000078768 

commiss 

0.016550999134779 

presid 

0.006248999852687 

state 

0.016462000086904 

seller 

0.006066000089049 

util 

0.015412000007927 

million 

0.005644000135362 

energi 

0.012106000445783 

past 

0.005297999829054 

public 

0.011222000233829 

greet 

0.005059999879450 

legisl 

0.011056000366807 

folk 

0.004801000002772 

regulatori 

0.009584999643266 

billion 

0.004801000002772 

senat 

0.009043999947608 

invest 

0.004757000133395 

feder

0.008933000266552 

perfect 

0.004648999776691 

affair 

0.008689999580383 

san 

0.004552000202239 

regul 

0.008523999713361

economi

0.004271000158042 

governor 

0.008446999825537 

alert 

0.004087000153959 

propos 

0.008070999756455 

direct 

0.003990000113845 

sue 

0.007938000373542 

bond 

0.003945999778807 

consum 

0.007915999740362 

copyright 

0.003902999917045 

gener 

0.006390000227839 

wrote 

0.003892000066116 

committe 

0.006157999858260 

econom 

0.003892000066116 

rate 

0.006014000158757 

cut 

0.003654999891296 

deregul 

0.006002999842167 

thing 

0.003502999898046 

press 

0.005958999972790 

stori 

0.003492000047117 

vote 

0.005638999864459 

big 

0.003492000047117 

rule 

0.005594000220299 

year 

0.003481999970973 

commission 

0.005539000034332 

share 

0.003470999887213 

44 

bill 

0.005516999866813 

45 

cooper 

0.012365999631584 

love 

0.020480999723077 

home 

0.007422999944538 

night 

0.018602000549436 

hous 

0.006802999880165 

hei 

0.018373999744654 

small 

0.005903000012040 

good 

0.018340999260545 

water 

0.005042000208050 

hope 

0.018232999369502 

food 

0.005001999903470 

great 

0.016542999073863 

tast 

0.004602000117302 

home 

0.014785000123084 

cook 

0.004422000143677 

weekend 

0.014737999998033 

land 

0.003881999989972 

realli 

0.013135000132024 

sat 

0.003682000096887 

fun 

0.012249999679625 

bottl 

0.003522000042722 

thing 

0.010390999726951 

minut 

0.003441999899223 

littl 

0.010277000255883 

serv 

0.003421999979764 

friend 

0.010142999701202 

qualiti 

0.003421999979764 

pretti 

0.008995999582112 

front 

0.003402000060305 

nice 

0.008620000444353 

garden 

0.003381999908015 

anywai 

0.008453000336885 

fresh 

0.003342000069097 

hous 

0.008438999764621 

wind 

0.003282000077888 

big 

0.008109999820590 

build 

0.003222000086680 

sound 

0.007956000044942 

onli 

0.003201999934390 

dinner 

0.007922999560833 

high 

0.003122000023723 

sorri 

0.007761999964714 

weather 

0.003102000104263 

gui 

0.007613999769092 

american 

0.003062000032514 

someth 

0.007581000216305 

thing 

0.003021999960765 

leav 

0.007513000164181 

red 

0.003021999960765 

46 

tonight 

0.007493000011891 

47 

thi 

0.009634000249207 

deal 

0.047152001410723 

footbal

0.009565000422299 

chang 

0.022467000409961

ar

0.008351000025868 

subject 

0.021685000509024 

game 

0.007868000306189 

thi 

0.020232999697328 

season 

0.007302000187337 

ha 

0.018924999982119 

week 

0.006914999801666 

confirm 

0.015095000155270 

player 

0.006500999908894 

bui 

0.013435999862850 

team 

0.006115000229329 

check 

0.013388000428677 

plai 

0.006060000043362 

price 

0.013244000263512 

good 

0.005549000110477 

power 

0.012924999929965 

pleas 

0.005259000230581 

trade 

0.012717000208795 

ha 

0.005135000217706 

peak 

0.012621999718249 

fantasi

0.005107000004500 

sell 

0.011695999652147 

start 

0.005038000177592 

enter 

0.011695999652147 

year 

0.004954999778420 

ar 

0.011536999605596 

onli 

0.004817000124604 

pleas 

0.011505000293255 

sign 

0.004803000018001 

miss 

0.011328999884427 

dai 

0.004445000085980 

show 

0.010308000259101 

wa 

0.004348000045866 

broker 

0.009573999792337 

score 

0.004306999966502 

bill 

0.009494000114501 

run 

0.004265000112355 

deliveri 

0.008553000167012 

big 

0.004224000032991 

time 

0.008392999880016 

pick 

0.004209999926388 

follow 

0.008298000320792 

save 

0.004112999886274 

point 

0.008266000077128 

earli 

0.003905999939889 

hourli 

0.008217999711633 

A.3 PLSI-U with all Words (No Dictionary)


0 

1

governor 

0.0031310001 

dbcap 

0.0364030004 

calpin 

0.0030650001 

cpuc 

0.0067199999 

iep 

0.0027960001 

pge 

0.0057259998 

dasovich 

0.0025650000 

workshop 

0.0054040002 

edison 

0.0025589999 

of  0 

0.0052580000 

gov 

0.0024619999 

testimoni 

0.0051150001 

iepa 

0.0020570001 

socalga 

0.0050180000 

duke 

0.0019930000 

gov 

0.0045220000 

mara

0.0019490001 

tp 

0.0041040001 

kaplan 

0.0018820000 

exhibit 

0.0037829999 

cpuc 

0.0018240000 

accord 

0.0037130001 

billion 

0.0016800000 

ii 

0.0033090001 

dynegi 

0.0016470000 

oii 

0.0030650001 

legisl 

0.0016200000 

tariff 

0.0030520000 

see 

0.0015350000 

sempra 

0.0029180001 

press 

0.0015160000 

see 

0.0028009999 

smutni 

0.0014510000 

gmssr 

0.0024309999

assembl

0.0013710000 

ed 

0.0013510000 

Pge 

0.0013420000 

paula 

0.0013120000 

jack 

0.0012980000 

kent 

0.0012840000 

jan 

0.0012830000 

kati 

0.0012660000 

2 

bpa 

0.0085990001 

gov 

0.0064329999 

produc 

0.0054600001 

iep 

0.0043360000 

pngc 

0.0036889999 

disclos 

0.0030189999 

describ 

0.0027900001 

wp 

0.0027230000 

ds 

0.0024699999 

cpuc 

0.0021960000 

mr 

0.0021929999 

written 

0.0020669999 

purpos 

0.0019120000 

caiso 

0.0018810000 

ci 

0.0018720001 

ident 

0.0018670000 

testimoni 

0.0018600000 

peter 

0.0018330000 

occur 

0.0017980000 

jcg 

0.0017320000 

crac 

0.0017140000 

kaplan 

0.0016680000 

repres 

0.0015870000 

oral 

0.0015670001 

text 

0.0015519999 

4 

haa 

0.0040330002 

berkelei 

0.0039470000 

msn 

0.0032269999 

ewe 

0.0023429999 

mwe 

0.0016710000 

chui 

0.0016340000 

Ing 

0.0016320000 

tblload 

0.0022940000 

dynegi 

0.0022460001 

comprehens 

0.0021650000 

core 

0.0019729999 

aelaw 

0.0019660001 

tblintchg 

0.0019290000 

counihan 

0.0017560000 

calpin 

0.0017410000 

3 

wpd 

0.0069550001 

swap 

0.0053249998 

pulp 

0.0033330000 

forestweb

0.0032690000 

tonn 

0.0032339999 

paper 

0.0029889999 

div 

0.0029470001 

trust 

0.0027520000 

agmt 

0.0027230000 

facil

0.0024860001 

andrew 

0.0023510000 

shackleton 

0.0023480000 

taf 

0.0021939999 

digest 

0.0020830000 

userrefer

0.0019540000 

produc 

0.0019110000 

kurth 

0.0018840000 

raptor 

0.0018020000 

megarret 

0.0017659999 

monika 

0.0017610000 

hawaii 

0.0016850000 

ii 

0.0015610000 

certif 

0.0015580000 

notifi

0.0015510001 

top 

0.0015470000 

5 

bid 

0.0224110000 

northwest 

0.0058650002 

shipper 

0.0058610002 

button 

0.0044359998 

screen 

0.0041089999 

packag 

0.0037710001 

mainten 

0.0036970000

telex

0.0016100000 

submit 

0.0035140000 

inmarsat 

0.0016100000 

dth 

0.0034159999 

swbell 

0.0015610000 

station 

0.0031870001 

averag 

0.0013670000 

invoic 

0.0029480001 

york 

0.0013610000 

electron 

0.0027739999 

hoegh 

0.0012180000 

nation 

0.0025120000 

chron 

0.0011660000 

pacif 

0.0024480000 

tk 

0.0011089999 

referenc

0.0023860000 

expedia 

0.0011010000 

receipt 

0.0022789999 

fare 

0.0010470001 

replac 

0.0022239999 

rob 

0.0010430000 

ebb 

0.0021380000 

bank 

0.0010410000 

maximum 

0.0020780000 

tx 

0.0010190000 

prearrang 

0.0020039999 

famili

0.0010090000 

cap 

0.0018190000 

trip 

0.0010010001 

repres 

0.0017930000 

rr 

0.0009890000 

safeti

0.0017890000 

consum 

0.0009250000 

statement 

0.0017690000 

explor 

0.0009240000 

6 

central 

0.0017680000 

7 

tropic 

0.0020830000 

unifi

0.0065009999 

weather 

0.0018130000 

sap 

0.0041230000 

shackleton 

0.0016160000 

netco 

0.0034830000 

folder 

0.0014930000 

sitara 

0.0028359999 

brazil 

0.0014850000 

script 

0.0026769999 

brent 

0.0014780000 

class 

0.0023139999 

econom 

0.0014720000 

setup 

0.0022260000 

synchron 

0.0012660000 

path 

0.0022040000 

low 

0.0012400000 

cst 

0.0021430000 

lynn 

0.0012370000 

regan 

0.0021170001 

kean 

0.0012309999 

logist 

0.0020570001 

educ 

0.0012070000 

prior 

0.0019970001 

storm 

0.0011970000 

scenario 

0.0019670001 

bruce 

0.0010990000 

edi 

0.0018660000 

penn 

0.0010900000 

server 

0.0018170000 

central 

0.0010850000 

invoic 

0.0017910000 

condit 

0.0010720000 

integr 

0.0017420000 

andrea 

0.0010260000 

tammi 

0.0017240000 

environment 

0.0010220000 

terri 

0.0016880000 

hendri 

0.0009980000 

wade 

0.0016070000 

dynegi 

0.0009900000 

entri 

0.0015760000 

exist 

0.0009730000 

estat 

0.0015130000 

upper 

0.0009700000 

login 

0.0015010000 

wind 

0.0009460000 

stage 

0.0014640000

hold

0.0009440000 

8 

patti 

0.0014330000 

9 

tana 

0.0040020002 

children 

0.0026499999 

ew 

0.0032120000 

timesheet 

0.0022060000 

temp 

0.0026060001 

minut

0.0017300000 

cook 

0.0026030000 

lara 

0.0016300000 

nda 

0.0022900000 

janel 

0.0015700000 

greenberg 

0.0021869999 

Stephen 

0.0015160000 

carol 

0.0020099999 

dub 

0.0014820000 

folder 

0.0019080000 

lexi 

0.0014780000 

swap 

0.0017980000 

maureen 

0.0014759999 

peter 

0.0017860000 

mcvicker 

0.0014290001 

wholesal 

0.0016600000 

session 

0.0014180000 

left 

0.0016430001 

michel 

0.0014149999 

clair 

0.0015930000 

sue 

0.0014080000 

senior 

0.0015580000 

class 

0.0013600000 

add 

0.0013630000 

sa 

0.0012940000 

lesli 

0.0012480000 

frank 

0.0012800000 

enrononlin 

0.0012230000 

leibman 

0.0012350000 

temporari 

0.0012080000 

kean 

0.0012030000 

calendar 

0.0011820000 

sarah 

0.0011830000 

bank 

0.0011549999 

interview 

0.0011820000 

provis 

0.0011130000 

palmer 

0.0011730000 

nymex 

0.0010939999 

sf  0 

0.0011390001 

brent 

0.0010720000 

hu 

0.0011370000 

heard 

0.0010250000 

sylvia 

0.0010990000 

shackleton 

0.0009950000 

10 

elizabeth 

0.0010990000 

11 

rod 

0.0053349999 

prc 

0.0030720001 

hayslett 

0.0039769998 

video 

0.0023230000 

traci 

0.0039340002 

weekli 

0.0020699999 

eott 

0.0032990000 

ken 

0.0017710000 

geaccon 

0.0028150000 

dial 

0.0016400000 

stan 

0.0022849999

0.0159777664

germani

0.0157330483 

deal 

0.0136963669 

xl 

0.0124964612 

daren 

0.0111623565 

volum 

0.0095756389 

farmer 

0.0093151331 

hpl 

0.0074363342

deliveri 

0.0065364055 

meter 

0.0061969585 

flow 

0.0057627824 

capac 

0.0050602062 

receipt 

0.0050049471 

nomin 

0.0047523356 

sitara 

0.0046418179 

mmbtu 

0.0043576299 

nom 

0.0042155357 

ce 

0.0041129123 

april 

0.0040892302 

transport 

0.0038208300 

lisa 

0.0037813596 

reach 

0.0061844615 

automat 

0.0060449648 

turn 

0.0059984662 

ou 

0.0058705942 

space 

0.0058357199 

de 

0.0056962236 

item 

0.0055451025 

longer 

0.0054521048 

gmt 

0.0054521048 

warn 

0.0054288553 

size 

0.0053707319 

delet 

0.0053242329 

cn 

0.0052544847 

mailbox 

0.0050336150 

ee 

0.0050103660 

inlin 

0.0049289926 

folder 

0.0049057435 

individu 

0.0049057435 

client 

0.0048941188 

limit 

0.0048592445 

button 

0.0048243706

game

0.0126587525 

plai 

0.0072579952 

season 

0.0062541370 

texa 

0.0059630182 

footbal

0.0059128255 

colleg 

0.0048788516 

team 

0.0048386971 

saturdai 

0.0044672694 

austin 

0.0041962280 

player 

0.0040356102 

coach 

0.0038549160 

fan 

0.0038248003 

big 

0.0037143759 

sport 

0.0034634112 

win 

0.0033228712 

ut 

0.0033128324 

ticket 

0.0032425625 

michael 

0.0032425625 

true 

0.0031120609 

sundai 

0.0029414049 

run 

0.0029213279

point

0.0035050656 

dalla

0.0028912120 

confirm 

0.0033945481 

bowl 

0.0028008649 

path 

0.0033471833 

save 

0.0028008649 

invoic 

0.0033156069 

St 

0.0027406334
Appendix B. Most Probable Users and Explicit Social Network Statistics


B.1 PLSI-U with only Dictionary Words

******************************************************************************************************************* 
CATEGORY 0

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 833                   COMPONENTS: 4
LARGEST COMPONENT SIZE: 812     PERCENT OF TOTAL GRAPH: 97.48%
GROUP DEGREE: 0.25217           GRAPH DENSITY: 0.00361
GROUP CLOSENESS: 0.00364        GROUP BETWEENNESS: 0.37696
AVERAGE p(z|u): 0.76            STDEV p(z|u): 0.33

MOST PROBABLE USERS

Topic#  ID#    Email Address                      Name                  p(z|u)
0       8431   skean@enron.com                    Steve Kean            0.017804
0       2222   harry.kingerski@enron.com          Harry Kingerski       0.015902
0       3132   james.wright@enron.com                                   0.014326
0       1475   dennis.benevides@enron.com                               0.014257
0       34229  mpalmer@enron.com                  mpalmer@enron.com     0.013434
0       253    jeff.dasovich@enron.com            Jeff Dasovich         0.013172
0       1746   scott.stoness@enron.com                                  0.012924
0       9244   richard.sanders@enron.com                                0.012769
0       181    paul.kaufman@enron.com                                   0.012533
0       8546   sandra.mccubbin@enron.com          Sandra McCubbin       0.012356
0       801    susan.mara@enron.com               Susan Mara            0.012340
0       1489   james.steffes@enron.com                                  0.011824
0       28783  roger.yang@enron.com                                     0.011751
0       817    richard.shapiro@enron.com          Richard Shapiro       0.011482
0       1180   karen.denne@enron.com              Karen Denne           0.011450
0       213    chris.foster@enron.com             Chris H Foster        0.011375
0       1016   neil.bresnan@enron.com             Neil Bresnan          0.011177
0       7213   mike.smith@enron.com               Mike Smith            0.010670
0       3157   jennifer.rudolph@enron.com                               0.010646
0       37     tim.belden@enron.com               Tim Belden            0.010578
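
The structural figures in each EXPLICIT SOCIAL NETWORK STATISTICS block (VERTICES, COMPONENTS, LARGEST COMPONENT SIZE, PERCENT OF TOTAL GRAPH, GRAPH DENSITY) can be illustrated with a minimal sketch. It assumes an undirected graph built from (sender, recipient) pairs and the standard undirected density 2E / (V(V-1)); this appendix does not state which conventions were actually used, so the function name, the edge format, and the handling of self-addressed mail are all hypothetical. The group centralization measures (degree, closeness, betweenness) are not covered.

```python
from collections import defaultdict


def explicit_network_stats(edges):
    """Summarize an undirected graph given as (sender, recipient) pairs.

    Assumption: edges are undirected and self-addressed mail is ignored.
    """
    adjacency = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # skip self-loops (an assumption, not from the source)
        adjacency[a].add(b)
        adjacency[b].add(a)

    # Find connected components with an iterative depth-first search.
    seen = set()
    component_sizes = []
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        component_sizes.append(size)

    n = len(adjacency)
    # Each undirected edge appears in two adjacency sets.
    m = sum(len(neighbors) for neighbors in adjacency.values()) // 2
    largest = max(component_sizes, default=0)
    return {
        "vertices": n,
        "components": len(component_sizes),
        "largest_component_size": largest,
        "percent_of_total_graph": 100.0 * largest / n if n else 0.0,
        "graph_density": 2.0 * m / (n * (n - 1)) if n > 1 else 0.0,
    }


if __name__ == "__main__":
    toy_edges = [("a", "b"), ("b", "c"), ("d", "e")]
    print(explicit_network_stats(toy_edges))
```

Under these assumptions, PERCENT OF TOTAL GRAPH is simply the largest component size divided by the vertex count, which agrees with the figures reported for Category 0 (812 of 833 vertices, 97.48%).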


*******************************************************************************************************************

CATEGORY 1

EXPLICIT SOCIAL NETWORK STATISTICS

AVERAGE p(z|u): 0.44            STDEV p(z|u): 0.17

MOST PROBABLE USERS

Topic#  ID#    Email Address                      Name                  p(z|u)
1       256    pete.davis@enron.com               Pete Davis            0.181147
1       20     geir.solberg@enron.com             Geir Solberg          0.089917
1       19     ryan.slinger@enron.com             Ryan Slinger          0.089820
1       14     mark.guzman@enron.com              Mark Guzman           0.089373
1       12     craig.dean@enron.com               Craig Dean            0.085794
1       8      bill.williams@enron.com            Bill Williams III     0.070370
1       152    john.anderson@enron.com            John Anderson         0.053314
1       219    michael.mier@enron.com             Michael Mier          0.053254
1       108    albert.meyers@enron.com                                  0.052895
1       15     leaf.harasin@enron.com             Leaf Harasin          0.036543
1       17     bert.meyers@enron.com              Bert Meyers           0.036429
1       79     eric.linder@enron.com              Eric Linder           0.032548
1       11     monika.causholli@enron.com         Monika Causholli      0.025655
1       28280  jbryson@enron.com                                        0.019979
1       28279  dporter3@enron.com                                       0.019974
1       24     bill.williams.iii@enron.com        bill.williams.iii     0.019682
1       21     kate.symes@enron.com               Kate Symes            0.016673
1       92     holden.salisbury@enron.com         Holden Salisbury      0.009554
1       89     greg.wolfe@enron.com               Greg Wolfe            0.008244
1       16     steven.merris@enron.com            Steven Merris         0.006087


******************************************************************************************************************* 
CATEGORY 2

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 619                   COMPONENTS: 10
LARGEST COMPONENT SIZE: 585     PERCENT OF TOTAL GRAPH: 94.51%
GROUP DEGREE: 0.19304           GRAPH DENSITY: 0.00647
GROUP CLOSENESS: 0.00232        GROUP BETWEENNESS: 0.27696
AVERAGE p(z|u): 0.23            STDEV p(z|u): 0.26

MOST PROBABLE USERS

Topic#  ID#    Email Address                      Name                  p(z|u)
2       4110   klay@enron.com                                           0.075640
2       1180   karen.denne@enron.com              Karen Denne           0.066720
2       8546   sandra.mccubbin@enron.com          Sandra McCubbin       0.049375
2       181    paul.kaufman@enron.com                                   0.039195
2       253    jeff.dasovich@enron.com            Jeff Dasovich         0.038103
2       2222   harry.kingerski@enron.com          Harry Kingerski       0.035919
2       1490   steven.kean@enron.com                                    0.033102
2       1547   mark.palmer@enron.com                                    0.032138
2       46859  smara@enron.com                    "                     0.031253
2       1489   james.steffes@enron.com                                  0.028124
2       801    susan.mara@enron.com               Susan Mara            0.025373
2       8431   skean@enron.com                    Steve Kean            0.024463
2       2326   janel.guerrero@enron.com                                 0.023961
2       159    david.parquet@enron.com            David Parquet         0.020838
2       28654  mona.petrochko@enron.com                                 0.020580
2       4851   peggy.mahoney@enron.com                                  0.018540
2       17095  mary.hain@enron.com                                      0.015173
2       34229  mpalmer@enron.com                  mpalmer@enron.com     0.014619
2       46814  jdasovic@enron.com                 "Jeff Dasovich "      0.014236
2       3495   nicholas.o'day@enron.com                                 0.013499


***************************************************************************************************************** 
CATEGORY 3

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 384                   COMPONENTS: 3
LARGEST COMPONENT SIZE: 378     PERCENT OF TOTAL GRAPH: 98.44%
GROUP DEGREE: 0.17030           GRAPH DENSITY: 0.00783
GROUP CLOSENESS: 0.02125        GROUP BETWEENNESS: 0.63193
AVERAGE p(z|u): 0.33            STDEV p(z|u): 0.19

MOST PROBABLE USERS

Topic#  ID#    Email Address                      Name                  p(z|u)
3       256    pete.davis@enron.com               Pete Davis            0.157820
3       14     mark.guzman@enron.com              Mark Guzman           0.080374
3       19     ryan.slinger@enron.com             Ryan Slinger          0.079861
3       20     geir.solberg@enron.com             Geir Solberg          0.079077
3       12     craig.dean@enron.com               Craig Dean            0.073113
3       15     leaf.harasin@enron.com             Leaf Harasin          0.060740
3       17     bert.meyers@enron.com              Bert Meyers           0.060712
3       79     eric.linder@enron.com              Eric Linder           0.054025
3       11     monika.causholli@enron.com         Monika Causholli      0.051216
3       24     bill.williams.iii@enron.com        bill.williams.iii     0.042756
3       28279  dporter3@enron.com                                       0.042026
3       28280  jbryson@enron.com                                        0.042018
3       8      bill.williams@enron.com            Bill Williams III     0.037072
3       92     holden.salisbury@enron.com         Holden Salisbury      0.025247
3       89     greg.wolfe@enron.com               Greg Wolfe            0.022114
3       152    john.anderson@enron.com            John Anderson         0.020014
3       219    michael.mier@enron.com             Michael Mier          0.019979
3       108    albert.meyers@enron.com                                  0.019555
3       21     kate.symes@enron.com               Kate Symes            0.018203
3       16     steven.merris@enron.com            Steven Merris         0.009220
CATEGORY 4

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 1417                  COMPONENTS: 12
LARGEST COMPONENT SIZE: 1383    PERCENT OF TOTAL GRAPH: 97.60%
GROUP DEGREE: 0.06840           GRAPH DENSITY: 0.00212
GROUP CLOSENESS: 0.00178        GROUP BETWEENNESS: 0.23742
AVERAGE p(z|u): 0.80            STDEV p(z|u): 0.36

MOST PROBABLE USERS

Topic#  ID#    Email Address                      Name                  p(z|u)
4       367    w..white@enron.com                 Stacey W. White       0.073154
4       268    casey.evans@enron.com              Casey Evans           0.062679
4       2591   jim.schwieger@enron.com            Jim Schwieger         0.047768
4       2587   kevin.ruscitti@enron.com           Kevin Ruscitti        0.042155
4       36263  dan.j.hyvl@enron.com               dan.j.hyvl            0.026494
4       62     john.postlethwaite@enron.com       John Postlethwaite    0.026020
4       360    wayne.vinson@enron.com             Donald Wayne Vinson   0.020790
4       251    andrea.dahlke@enron.com            Andrea Dahlke         0.019940
4       739    andrea.ring@enron.com              Andrea Ring           0.019343
4       53779  center.ets@enron.com               ETS Omaha Solution C  0.015400
4       2548   f..keavey@enron.com                Peter F. Keavey       0.015216
4       85273  tlokey@enron.com                   WALTER LOKEY          0.014185
4       302    paul.lewis@enron.com               Jon Paul Lewis        0.013956
4       300    norman.lee@enron.com               Norman Lee            0.012247
4       124    chris.stokley@enron.com            Chris Stokley         0.011201
4       2989   tim.carter@enron.com               Tim Carter            0.010998
4       19764  limor.nissan@enron.com                                   0.010467
4       447    grant.oh@enron.com                 Grant Oh              0.008784
4       1333   zhiyun.yang@enron.com              Zhiyun Yang           0.008420
4       2247   tom.chapman@enron.com              Tom Chapman           0.008346


CATEGORY 5

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 1483                  COMPONENTS: 7
LARGEST COMPONENT SIZE: 1469    PERCENT OF TOTAL GRAPH: 99.06%
GROUP DEGREE: 0.12210           GRAPH DENSITY: 0.00201
GROUP CLOSENESS: 0.00562        GROUP BETWEENNESS: 0.20771
AVERAGE p(z|u): 0.40            STDEV p(z|u): 0.38

MOST PROBABLE USERS

Topic#  ID#    Email Address                      Name                  p(z|u)
5       701    john.lavorato@enron.com            John Lavorato         0.067369
5       1452   david.delainey@enron.com           David Delainey        0.043118
5       37     tim.belden@enron.com               Tim Belden            0.039772
5       817    richard.shapiro@enron.com          Richard Shapiro       0.028748
5       680    mike.grigsby@enron.com             Mike Grigsby          0.028571
5       782    don.black@enron.com                Don Black             0.026678
5       445    rob.milnthorp@enron.com            Rob Milnthorp         0.026577
5       1453   janet.dietrich@enron.com           Janet Dietrich        0.025799
5       347    d..steffes@enron.com               James D. Steffes      0.024981
5       2823   kevin.presto@enron.com                                   0.024093
5       253    jeff.dasovich@enron.com            Jeff Dasovich         0.023118
5       10241  phillip.allen@enron.com                                  0.021235
5       293    louise.kitchen@enron.com           Louise Kitchen        0.020737
5       2318   vicki.sharp@enron.com              Vicki Sharp           0.018829
5       168    christopher.calger@enron.com                             0.017476
5       2356   kristin.walsh@enron.com            Kristin Walsh         0.016847
5       1786   michael.tribolet@enron.com                               0.015361
5       1489   james.steffes@enron.com                                  0.015189
5       1456   dan.leff@enron.com                 Dan Leff              0.014888
5       1752   mark.tawney@enron.com                                    0.014572


******************************************************************************************************************* 
CATEGORY 6

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 1217                  COMPONENTS: 3
LARGEST COMPONENT SIZE: 1213    PERCENT OF TOTAL GRAPH: 99.67%
GROUP DEGREE: 0.24263           GRAPH DENSITY: 0.00329
GROUP CLOSENESS: 0.02475        GROUP BETWEENNESS: 0.35841
AVERAGE p(z|u): 0.49            STDEV p(z|u): 0.40

MOST PROBABLE USERS

Topic#  ID#    Email Address                      Name                  p(z|u)
6       1093   kay.mann@enron.com                 Kay Mann              0.302267
6       1651   ben.jacoby@enron.com                                     0.058002
6       154    sheila.tweed@enron.com             Sheila Tweed          0.035808
6       17542  roseann.engeldorf@enron.com                              0.025677
6       20027  kathleen.carnahan@enron.com                              0.023955
6       17261  carlos.sole@enron.com                                    0.023896
6       2386   heather.kroll@enron.com            Heather Kroll         0.021874
6       6899   fred.mitro@enron.com               Fred Mitro            0.021540
6       17208  chris.booth@enron.com                                    0.019831
6       1568   lisa.bills@enron.com                                     0.019195
6       588    reagan.rorschach@enron.com         Reagan Rorschach      0.016202
6       12004  suzanne.adams@enron.com            Suzanne Adams         0.014484
6       190    dale.rasmussen@enron.com                                 0.013117
6       2245   ozzie.pagan@enron.com              Ozzie Pagan           0.012718
6       24438  john.schwartzenburg@enron.com                            0.011801
6       1117   david.fairley@enron.com            David Fairley         0.011529
6       20018  stuart.zisman@enron.com                                  0.011102
6       24415  scott.dieball@enron.com                                  0.011082
6       3638   gregg.penman@enron.com             Gregg Penman          0.010857
6       10237  herman.manis@enron.com                                   0.009950


***************************************************************************************************************** 
CATEGORY 7

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 1401                  COMPONENTS: 8
LARGEST COMPONENT SIZE: 1379    PERCENT OF TOTAL GRAPH: 98.43%
GROUP DEGREE: 0.14769           GRAPH DENSITY: 0.00214
GROUP CLOSENESS: 0.00293        GROUP BETWEENNESS: 0.29818
AVERAGE p(z|u): 0.45            STDEV p(z|u): 0.42

MOST PROBABLE USERS

Topic#  ID#    Email Address                      Name                  p(z|u)
7       4135   mark.haedicke@enron.com                                  0.127026
7       15296  jeffrey.hodge@enron.com                                  0.075508
7       275    e..haedicke@enron.com              Mark E. Haedicke      0.049655
7       155    elizabeth.sager@enron.com          Elizabeth Sager       0.042695
7       1092   travis.mccullough@enron.com        Travis McCullough     0.040162
7       318    marcus.nettelton@enron.com         Marcus Nettelton      0.034889
7       266    janette.elbertson@enron.com        Janette Elbertson     0.020736
7       6015   jordan.mintz@enron.com                                   0.016206
7       3110   julia.murray@enron.com                                   0.015917
7       437    peter.keohane@enron.com            Peter Keohane         0.014761
7       3100   alan.aronowitz@enron.com                                 0.014708
7       2318   vicki.sharp@enron.com              Vicki Sharp           0.013622
7       5121   eric.thode@enron.com                                     0.013160
7       17252  james.keller@enron.com                                   0.011810
7       17098  janice.moore@enron.com                                   0.011289
7       3474   .schuler@enron.com                 legal                 0.010038
7       14697  barbara.gray@enron.com             Barbara Gray          0.009889
7       445    rob.milnthorp@enron.com            Rob Milnthorp         0.009384
7       5621   lance.schuler-legal@enron.com                            0.009148
7       4940   rob.walls@enron.com                                      0.007870
CATEGORY 8

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 3089                  COMPONENTS: 11
LARGEST COMPONENT SIZE: 3050    PERCENT OF TOTAL GRAPH: 98.74%
GROUP DEGREE: 0.15864           GRAPH DENSITY: 0.00130
GROUP CLOSENESS: 0.00103        GROUP BETWEENNESS: 0.25931
AVERAGE p(z|u): 0.50            STDEV p(z|u): 0.41

MOST PROBABLE USERS

Topic#  ID#    Email Address                      Name                  p(z|u)
8       18647  carol.clair@enron.com                                    0.096720
8       280    marie.heard@enron.com              Marie Heard           0.095487
8       15273  dan.hyvl@enron.com                 Dan J Hyvl            0.071036
8       3101   susan.bailey@enron.com                                   0.065159
8       6815   debra.perlingiere@enron.com        Debra Perlingiere     0.064973
8       2365   mary.cook@enron.com                Mary Cook             0.058524
8       3098   stephanie.panus@enron.com          Stephanie Panus       0.058013
8       3404   samantha.boyd@enron.com                                  0.036209
8       2390   brent.hendry@enron.com             Brent Hendry          0.034352
8       20682  cheryl.nelson@enron.com            Cheryl Nelson         0.029334
8       1100   russell.diamond@enron.com          Russell Diamond       0.028116
8       3113   sara.shackleton@enron.com                                0.025853
8       5889   frank.sayre@enron.com                                    0.023844
8       3103   robert.bruce@enron.com                                   0.020908
8       14787  jason.williams@enron.com           Jason Williams        0.015493
8       19792  taffy.milligan@enron.com                                 0.015034
8       9117   susan.flynn@enron.com                                    0.014653
8       5897   mark.taylor@enron.com                                    0.014076
8       9094   stacy.dickson@enron.com                                  0.014028
8       20033  kaye.ellis@enron.com                                     0.013844


CATEGORY 9

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 2444                  COMPONENTS: 4
LARGEST COMPONENT SIZE: 2433    PERCENT OF TOTAL GRAPH: 99.55%
GROUP DEGREE: 0.33079           GRAPH DENSITY: 0.00082
GROUP CLOSENESS: 0.00786        GROUP BETWEENNESS: 0.56920
AVERAGE p(z|u): 0.56            STDEV p(z|u): 0.44

MOST PROBABLE USERS

Topic#  ID#    Email Address                      Name                  p(z|u)
9       355    .taylor@enron.com                  legal                 0.187008
9       619    .williams@enron.com                credit                0.055386
9       1488   perfmgmt@enron.com                 "Performance Evaluat  0.037612
9       606    s..theriot@enron.com               Kim S. Theriot        0.029576
9       294    c..koehler@enron.com               Anne C. Koehler       0.028529
9       613    ellen.wallumrod@enron.com          Ellen Wallumrod       0.025875
9       3099   laurel.adams@enron.com                                   0.025411
9       8306   n..gray@enron.com                  Barbara N. Gray       0.025397
9       15198  e..dickson@enron.com               Stacy E. Dickson      0.024387
9       3409   stacey.richardson@enron.com                              0.019492
9       284    t..hodge@enron.com                 Jeffrey T. Hodge      0.018528
9       9039   theresa.brogan@enron.com                                 0.014218
9       15204  judy.thorne@enron.com              Judy Thorne           0.013367
9       14777  kay.young@enron.com                                      0.012102
9       3403   cyndie.balfour-flanagan@enron.com                        0.011474
9       1058   ann.murphy@enron.com               Melissa Ann Murphy    0.011186
9       783    r..brackett@enron.com              Debbie R. Brackett    0.010655
9       506    susan.elledge@enron.com            Susan Elledge         0.010414
9       7667   ipayit@enron.com                   iPayit@Enron.com>@EN  0.008657
9       6591   aneela.charania@enron.com          Aneela Charania       0.007675


******************************************************************************************************************* 
CATEGORY 10

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 1072                  COMPONENTS: 4
LARGEST COMPONENT SIZE: 1064    PERCENT OF TOTAL GRAPH: 99.25%
GROUP DEGREE: 0.23147           GRAPH DENSITY: 0.00373
GROUP CLOSENESS: 0.01247        GROUP BETWEENNESS: 0.38826
AVERAGE p(z|u): 0.35            STDEV p(z|u): 0.35

MOST PROBABLE USERS

Topic#  ID#    Email Address                      Name                  p(z|u)
10      288    tana.jones@enron.com               Tana Jones            0.048598
10      1102   tom.moran@enron.com                Tom Moran             0.043781
10      33     stephanie.sever@enron.com          Stephanie Sever       0.039835
10      551    lisa.lees@enron.com                Lisa Lees             0.037141
10      1142   karen.lambert@enron.com            Karen Lambert         0.030246
10      8874   kelly.lombardi@enron.com           Kelly Lombardi        0.019002
10      5583   frank.davis@enron.com                                    0.017158
10      16417  larry.hunter@enron.com             Larry Joe Hunter      0.015613
10      5897   mark.taylor@enron.com                                    0.015035
10      486    anthony.campos@enron.com           Anthony Campos        0.014382
10      1101   tanya.rohauer@enron.com            Tanya Rohauer         0.013296
10      1449   samuel.schott@enron.com            Samuel Schott         0.013140
10      6098   debbie.brackett@enron.com                                0.012464
10      571    melissa.murphy@enron.com           Melissa Murphy        0.012326
10      1044   rhonda.denton@enron.com            Rhonda Denton         0.012011
10      480    bob.bowen@enron.com                Bob Bowen             0.010622
10      565    kevin.meredith@enron.com           Kevin Meredith        0.010263
10      18036  wendi.lebrocq@enron.com            Wendi Lebrocq         0.009937
10      6706   bernice.rodriguez@enron.com        Bernice Rodriguez     0.009664
10      4854   william.bradford@enron.com                               0.009508


****************************************************************************************************************** 
CATEGORY 11

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 1814                  COMPONENTS: 7
LARGEST COMPONENT SIZE: 1725    PERCENT OF TOTAL GRAPH: 95.09%
GROUP DEGREE: 0.20814           GRAPH DENSITY: 0.00331
GROUP CLOSENESS: 0.00064        GROUP BETWEENNESS: 0.29916
AVERAGE p(z|u): 0.65            STDEV p(z|u): 0.41

MOST PROBABLE USERS

Topic#  ID#    Email Address                      Name                  p(z|u)
11      288    tana.jones@enron.com               Tana Jones            0.508753
11      7158   mark.greenberg@enron.com           Mark Greenberg        0.064332
11      1005   mark.fisher@enron.com                                    0.040449
11      1019   leslie.hansen@enron.com            Leslie Hansen         0.032779
11      401    bob.shults@enron.com               Bob Shults            0.030348
11      4911   hollis.kimbrough@enron.com                               0.023842
11      23696  mark.walker@enron.com                                    0.015119
11      617    greg.whiting@enron.com             Greg Whiting          0.010684
11      23690  jeff.duff@enron.com                                      0.010491
11      23703  kurt.anderson@enron.com                                  0.009879
11      18398  thomas.gros@enron.com                                    0.007953
11      23706  jeff.maurer@enron.com                                    0.007285
11      5543   john.allario@enron.com                                   0.006756
11      23699  kevin.cousineau@enron.com                                0.006632
11      30596  sarah.wesner@enron.com                                   0.005849
11      23724  joe.thorpe@enron.com                                     0.005811
11      29414  denis.o'connell@enron.com                                0.004942
11      23705  bo.thisted@enron.com                                     0.004807
11      23713  ronald.brzezinski@enron.com                              0.004792
11      503    daniel.diamond@enron.com           Daniel Diamond        0.004184
CATEGORY 12

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 204                   COMPONENTS: 14
LARGEST COMPONENT SIZE: 129     PERCENT OF TOTAL GRAPH: 63.24%
GROUP DEGREE: 0.24411           GRAPH DENSITY: 0.00493
GROUP CLOSENESS: 0.00297        GROUP BETWEENNESS: 0.24458
AVERAGE p(z|u): 0.55            STDEV p(z|u): 0.47

MOST PROBABLE USERS

Topic#  ID#    Email Address                      Name                  p(z|u)
12      18633  e..brown@enron.com                 William E. Brown      0.008556
12      35765  l..johnson@enron.com               David L. Johnson      0.008521
12      6411   kean@enron.com                                           0.007211
12      65725  .chavez@enron.com                                        0.006382
12      55759  .kean@enron.com                                          0.005655
12      65733  .speck@enron.com                   e-mail                0.005176
12      65705  .basile@enron.com                  e-mail                0.005060
12      65706  .benzon@enron.com                  e-mail                0.005060
12      65707  .de@enron.com                                            0.005060
12      65711  .jones@enron.com                   e-mail                0.005060
12      65718  .rainey@enron.com                  e-mail                0.005060
12      65723  .blizzard@enron.com                                      0.005060
12      65728  .heeg@enron.com                    e-mail                0.005060
12      65729  .imparato@enron.com                                      0.005060
12      65731  .norris@enron.com                  e-mail                0.005060
12      65732  .salvo-shook@enron.com             e-mail                0.005060
12      65738  .barrow@enron.com                  e-mail                0.005060
12      65740  .bustillo@enron.com                e-mail                0.005060
12      65744  .goss@enron.com                                          0.005060
12      65746  .lehmann@enron.com                 e-mail                0.005060

CATEGORY 13

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 4311                  COMPONENTS: 16
LARGEST COMPONENT SIZE: 4267    PERCENT OF TOTAL GRAPH: 98.98%
GROUP DEGREE: 0.18886           GRAPH DENSITY: 0.00093
GROUP CLOSENESS: 0.00087        GROUP BETWEENNESS: 0.33955
AVERAGE p(z|u): 0.73            STDEV p(z|u): 0.40

MOST PROBABLE USERS

Topic#  ID#    Email Address                      Name                  p(z|u)
13      734    dutch.quigley@enron.com            Dutch Quigley         0.136689
13      2781   enron.announcements@enron.com                            0.132588
13      1078   40enron@enron.com                  Tracey Ramsey - Glob  0.127196
13      412    no.address@enron.com                                     0.081646
13      2883   all.houston@enron.com                                    0.042026
13      6007   houston.report@enron.com                                 0.024034
13      86     all.worldwide@enron.com            All Enron Worldwide   0.019420
13      4648   amelia.alder@enron.com                                   0.018362
13      84     all_ena_egm_eim@enron.com          All_ENA_EGM_EIM       0.017372
13      7470   kay.quigley@enron.com              Kay Quigley           0.008875
13      20046  darlene.forsyth@enron.com                                0.006007
13      6242   all.downtown@enron.com             All Enron Downtown    0.005775
13      19495  lola.willis@enron.com                                    0.005466
13      5588   all.states@enron.com                                     0.005403
13      8401   sarah.wesner-soong@enron.com       Sarah Wesner-Soong    0.004711
13      6226   enron.announcement@enron.com                             0.004510
13      21232  J.1john@enron.com                  work                  0.003539
13      6227   enron.action@enron.com                                   0.003288
13      34617  kmcdani@enron.com                                        0.003020
13      88     ena.employees@enron.com            ENA Employees         0.002840


******************************************************************************************************************* 
CATEGORY 14

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 870                   COMPONENTS: 16
LARGEST COMPONENT SIZE: 834     PERCENT OF TOTAL GRAPH: 95.86%
GROUP DEGREE: 0.22795           GRAPH DENSITY: 0.00230
GROUP CLOSENESS: 0.00234        GROUP BETWEENNESS: 0.64604
AVERAGE p(z|u): 0.36            STDEV p(z|u): 0.38

MOST PROBABLE USERS

Topic#  ID#    Email Address                      Name                  p(z|u)
14      1419   billy.lemmons@enron.com            Billy Lemmons Jr      0.013132
14      1703   tim.o'rourke@enron.com                                   0.010658
14      2780   kim.melodick@enron.com                                   0.010649
14      5517   sheila.walton@enron.com                                  0.009514
14      5914   kalen.pieper@enron.com                                   0.008566
14      5512   robert.jones@enron.com                                   0.007844
14      222    david.oxley@enron.com              David Oxley           0.007823
14      1420   traci.warner@enron.com             Traci Warner          0.007626
14      3454   exec.jones@enron.com                                     0.006929
14      5511   maria.barnard@enron.com                                  0.006869
14      5516   cindy.skinner@enron.com                                  0.006672
14      11380  shanna.funkhouser@enron.com                              0.006575
14      4934   gary.smith@enron.com                                     0.006044
14      2201   cindy.olson@enron.com              Cindy Olson           0.005948
14      6013   cynthia.barrow@enron.com                                 0.005482
14      18669  kirk.mcdaniel@enron.com                                  0.005385
14      15227  anne.labbe@enron.com                                     0.005378
14      12915  gary.buck@enron.com                Gary Buck             0.004942
14      5653   ted.bland@enron.com                                      0.004861
14      1832   jana.giovannini@enron.com          Jana Giovannini       0.004700


***************************************************************************************************************** 
CATEGORY 15

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 2757                  COMPONENTS: 11
LARGEST COMPONENT SIZE: 2718    PERCENT OF TOTAL GRAPH: 98.59%
GROUP DEGREE: 0.20953           GRAPH DENSITY: 0.00109
GROUP CLOSENESS: 0.00109        GROUP BETWEENNESS: 0.33939
AVERAGE p(z|u): 0.81            STDEV p(z|u): 0.35

MOST PROBABLE USERS

Topic#  ID#    Email Address                      Name                  p(z|u)
15      703    matthew.lenhart@enron.com          Matthew Lenhart       0.153960
15      2575   joe.parks@enron.com                Joe Parks             0.125052
15      2592   m..scott@enron.com                 Susan M. Scott        0.063080
15      453    cooper.richey@enron.com            Cooper Richey         0.041295
15      2530   chris.germany@enron.com            Chris Germany         0.039502
15      7568   monique.sanchez@enron.com          Monique Sanchez       0.031292
15      37122  wollam.erik@enron.com                                    0.020544
15      737    jay.reitmeyer@enron.com            Jay Reitmeyer         0.020303
15      1612   j..farmer@enron.com                                      0.018191
15      772    laura.vuittonet@enron.com          Laura Vuittonet       0.011607
15      37145  constantine.brian@enron.com        Brian Constantine     0.010156
15      6943   gregory.schockling@enron.com       Gregory Schockling    0.009379
15      709    a..martin@enron.com                Thomas A. Martin      0.009292
15      73     troy.denetsosie@enron.com          Troy Denetsosie       0.008784
15      37115  chet.fenner@enron.com                                    0.008253
15      3520   ragan.bond@enron.com               Ragan Bond            0.008083
15      3580   tiffany.smith@enron.com            Tiffany Smith         0.007438
15      2563   brad.mckay@enron.com               Brad Mckay            0.006387
15      37146  fenner.chet@enron.com              Chet Fenner           0.005871
15      726    michael.olsen@enron.com            Michael Olsen         0.005364
CATEGORY 16

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 1608                  COMPONENTS: 16
LARGEST COMPONENT SIZE: 1574    PERCENT OF TOTAL GRAPH: 97.89%
GROUP DEGREE: 0.09905           GRAPH DENSITY: 0.00249
GROUP CLOSENESS: 0.00183        GROUP BETWEENNESS: 0.41824
AVERAGE p(z|u): 0.44            STDEV p(z|u): 0.40

MOST PROBABLE USERS

Topic#  ID#    Email Address                      Name                  p(z|u)
16      6031   outlook.team@enron.com                                   0.041492
16      21091  jerry.harkreader@enron.com         Jerry Harkreader      0.002120
16      811    administration.enron@enron.com     Enron Messaging Admi  0.002113
16      740    tina.rode@enron.com                Tina Rode             0.002088
16      32115  mary.moore@enron.com                                     0.001996
16      5068   audrey.robertson@enron.com                               0.001991
16      8888   geraldine.irvine@enron.com         Geraldine Irvine      0.001977
16      3470   michael.miller@enron.com                                 0.001941
16      2530   chris.germany@enron.com            Chris Germany         0.001703
16      21092  jorge.olivares@enron.com           Jorge Olivares        0.001691
16      739    andrea.ring@enron.com              Andrea Ring           0.001662
16      2563   brad.mckay@enron.com               Brad Mckay            0.001659
16      14675  susan.pereira@enron.com            Susan Pereira         0.001656
16      6837   diane.salcido@enron.com            Diane Salcido         0.001623
16      35727  sigrid.macpherson@enron.com        Sigrid Macpherson     0.001599
16      6580   angela.barnett@enron.com           Angela Barnett        0.001593
16      5041   jo.matson@enron.com                                      0.001581
16      6018   suzanne.nicholie@enron.com                               0.001581
16      6992   judy.hernandez@enron.com           Judy Hernandez        0.001540
16      14706  peter.keavey@enron.com             Peter Keavey          0.001538


CATEGORY 17

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 3525                  COMPONENTS: 5
LARGEST COMPONENT SIZE: 3514    PERCENT OF TOTAL GRAPH: 99.69%
GROUP DEGREE: 0.19220           GRAPH DENSITY: 0.00142
GROUP CLOSENESS: 0.00880        GROUP BETWEENNESS: 0.41953
AVERAGE p(z|u): 0.41            STDEV p(z|u): 0.38

MOST PROBABLE USERS

Topic#  ID#    Email Address                      Name                  p(z|u)
17      227    sally.beck@enron.com               Sally Beck            0.169318
17      1771   shona.wilson@enron.com                                   0.025245
17      6399   beth.apollo@enron.com                                    0.023601
17      10227  brent.price@enron.com                                    0.020631
17      6526   mike.jordan@enron.com                                    0.019276
17      334    leslie.reeves@enron.com            Leslie Reeves         0.018080
17      6396   fernley.dyson@enron.com            Fernley Dyson         0.017788
17      3002   m.hall@enron.com                   Bob M Hall            0.014155
17      4796   ted.murphy@enron.com                                     0.013790
17      14781  patti.thompson@enron.com                                 0.012955
17      5368   brenda.herod@enron.com                                   0.011778
17      1978   greg.piper@enron.com                                     0.011435
17      1543   richard.causey@enron.com                                 0.010914
17      215    stacey.white@enron.com                                   0.010912
17      15922  bob.hall@enron.com                                       0.009304
17      19242  mary.solmonson@enron.com                                 0.008747
17      6398   chris.abel@enron.com                                     0.008661
17      3402   kristin.albrecht@enron.com                               0.008348
17      8847   scott.mills@enron.com              Scott Mills           0.007933
17      788    david.port@enron.com               David Port            0.007924


******************************************************************************************************************* 
CATEGORY 18

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 655                   COMPONENTS: 21
LARGEST COMPONENT SIZE: 487     PERCENT OF TOTAL GRAPH: 74.35%
GROUP DEGREE: 0.14953           GRAPH DENSITY: 0.00306
GROUP CLOSENESS: 0.00112        GROUP BETWEENNESS: 0.22792
AVERAGE p(z|u): 0.40            STDEV p(z|u): 0.41

MOST PROBABLE USERS

Topic#  ID#    Email Address                      Name                  p(z|u)
18      1418   lexi.elliott@enron.com             Lexi Elliott          0.013483
18      6032   kevin.moore@enron.com                                    0.011379
18      2088   paulo.issler@enron.com             Paulo Issler          0.011040
18      2108   william.smith@enron.com            William Smith         0.009490
18      4662   celeste.roberts@enron.com                                0.009256
18      4661   charlene.jackson@enron.com                               0.009050
18      15958  kevin.kindall@enron.com                                  0.008596
18      2068   alex.huang@enron.com               Alex Huang            0.007997
18      2071   anita.dupont@enron.com             Anita Dupont          0.007351
18      1831   kristin.gandy@enron.com            Kristin Gandy         0.007163
18      2079   jose.marquez@enron.com             Jose Marquez          0.007064
18      2072   bob.lee@enron.com                  Bob Lee               0.006873
18      2105   tom.halliburton@enron.com          Tom Halliburton       0.006776
18      2513   elena.chilkina@enron.com           Elena Chilkina        0.006719
18      1543   richard.causey@enron.com                                 0.006501
18      8822   kenneth.parkhill@enron.com         Kenneth Parkhill      0.006418
18      2109   zimin.lu@enron.com                 Zimin Lu              0.006278
18      2075   gwyn.koepke@enron.com              Gwyn Koepke           0.006249
18      1717   maureen.raymond@enron.com                                0.006239
18      2243   ashley.baxter@enron.com            Ashley Baxter         0.005935


****************************************************************************************************************** 
CATEGORY 19

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 323  COMPONENTS: 16
LARGEST COMPONENT SIZE: 159  PERCENT OF TOTAL GRAPH: 49.23%
GROUP DEGREE: 0.32463  GRAPH DENSITY: 0.00621
GROUP CLOSENESS: 0.00132  GROUP BETWEENNESS: 0.20860
AVERAGE p(z|u): 0.47  STDEV p(z|u): 0.37

MOST PROBABLE USERS

Topic#  ID#    Email Address                    Name                  p(z|u)
19      11182  marcus.edmonds@enron.com                               0.002994
19      4754   samuel.pak@enron.com                                   0.002851
19      11     monika.causholli@enron.com       Monika Causholli      0.002790
19      2032   rahul.seksaria@enron.com         Rahul Seksaria        0.002786
19      11169  brandon.cavazos@enron.com                              0.002778
19      1874   jody.crook@enron.com             Jody Crook            0.002744
19      11211  melanie.king@enron.com                                 0.002713
19      2387   tara.piazze@enron.com            Tara Piazze           0.002677
19      11227  ravi.mujumdar@enron.com                                0.002657
19      1878   bryan.hull@enron.com             Bryan Hull            0.002631
19      7125   peter.bennett@enron.com          Peter Bennett         0.002620
19      120    susan.rance@enron.com            Susan Rance           0.002619
19      8849   robin.rodrigue@enron.com         Robin Rodrigue        0.002614
19      11203  avinash.jain@enron.com                                 0.002601
19      6909   erin.willis@enron.com            Erin Willis           0.002597
19      7537   gisselle.rohmer@enron.com        Gisselle Rohmer       0.002591
19      6657   anthony.sexton@enron.com         Anthony Sexton        0.002586
19      11252  maria.tefel@enron.com                                  0.002558
19      7106   george.thomas@enron.com          George Thomas         0.002548
19      6649   binh.pham@enron.com              Binh Pham             0.002533

CATEGORY 20

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 915  COMPONENTS: 16
LARGEST COMPONENT SIZE: 849  PERCENT OF TOTAL GRAPH: 92.79%
GROUP DEGREE: 0.20328  GRAPH DENSITY: 0.00219
GROUP CLOSENESS: 0.00131  GROUP BETWEENNESS: 0.42725
AVERAGE p(z|u): 0.33  STDEV p(z|u): 0.39

MOST PROBABLE USERS

Topic#  ID#    Email Address                    Name                  p(z|u)
20      581    vladi.pimenov@enron.com          Vladi Pimenov         0.017949
20      696    jared.kaiser@enron.com           Jared Kaiser          0.012259
20      14675  susan.pereira@enron.com          Susan Pereira         0.009862
20      2501   kimberly.bates@enron.com         Kimberly Bates        0.008906
20      764    judy.townsend@enron.com          Judy Townsend         0.008533
20      756    geoff.storey@enron.com           Geoff Storey          0.008341
20      1676   laura.luce@enron.com                                   0.008216
20      2346   bryant.frihart@enron.com         Bryant Frihart        0.008215
20      2520   tom.donohoe@enron.com            Tom Donohoe           0.008072
20      2515   martin.cuilla@enron.com          Martin Cuilla         0.007621
20      2563   brad.mckay@enron.com             Brad Mckay            0.007609
20      1713   s..pollan@enron.com                                    0.007258
20      739    andrea.ring@enron.com            Andrea Ring           0.007102
20      2497   robin.barbe@enron.com            Robin Barbe           0.007091
20      747    s..shively@enron.com             Hunter S. Shively     0.007038
20      516    chris.gaskill@enron.com          Chris Gaskill         0.006917
20      719    l..mims@enron.com                Patrice L. Mims       0.006884
20      2587   kevin.ruscitti@enron.com         Kevin Ruscitti        0.006723
20      2600   maureen.smith@enron.com          Maureen Smith         0.006334
20      724    scott.neal@enron.com             Scott Neal            0.006252


CATEGORY 21

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 724  COMPONENTS: 16
LARGEST COMPONENT SIZE: 646  PERCENT OF TOTAL GRAPH: 89.23%
GROUP DEGREE: 0.14119  GRAPH DENSITY: 0.00277
GROUP CLOSENESS: 0.00145  GROUP BETWEENNESS: 0.39513
AVERAGE p(z|u): 0.30  STDEV p(z|u): 0.38

MOST PROBABLE USERS

Topic#  ID#    Email Address                    Name                  p(z|u)
21      2521   david.draper@enron.com           David Draper          0.002853
21      10232  jim.coffey@enron.com                                   0.002462
21      721    daniel.muschar@enron.com         Daniel Muschar        0.002232
21      5582   body.shop@enron.com                                    0.002214
21      23807  ge_benefits@enron.com                                  0.001920
21      15911  heather.johnson@enron.com                              0.001906
21      2607   barry.vanderhorst@enron.com      Barry Vanderhorst     0.001873
21      2536   p..hewitt@enron.com              Jess P. Hewitt        0.001866
21      2595   kristann.shireman@enron.com      Kristann Shireman     0.001858
21      73     troy.denetsosie@enron.com        Troy Denetsosie       0.001822
21      671    keith.dziadek@enron.com          Keith Dziadek         0.001700
21      7019   jason.mcnair@enron.com           Jason Mcnair          0.001642
21      18195  deborah.bubenko@enron.com                              0.001577
21      2581   jessica.presas@enron.com         Jessica Presas        0.001533
21      15289  tommy.yanowski@enron.com                               0.001531
21      7111   jeffrey.vincent@enron.com        Jeffrey Vincent       0.001489
21      14742  paul.tate@enron.com              Paul Tate             0.001475
21      2381   richard.ring@enron.com           Richard Ring          0.001469
21      24914  robert.humlicek@enron.com                              0.001449
21      1401   tina.holcombe@enron.com          Tina Holcombe         0.001443


******************************************************************************************************************* 
CATEGORY 22

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 972  COMPONENTS: 15
LARGEST COMPONENT SIZE: 889  PERCENT OF TOTAL GRAPH: 91.46%
GROUP DEGREE: 0.60909  GRAPH DENSITY: 0.00309
GROUP CLOSENESS: 0.00108  GROUP BETWEENNESS: 0.67923
AVERAGE p(z|u): 0.37  STDEV p(z|u): 0.40

MOST PROBABLE USERS

Topic#  ID#    Email Address                    Name                  p(z|u)
22      214    dan.dietrich@enron.com           Dan Dietrich          0.009126
22      516    chris.gaskill@enron.com          Chris Gaskill         0.008891
22      262    david.dronet@enron.com           David Dronet          0.007948
22      1757   colin.tonks@enron.com                                  0.007760
22      751    bruce.smith@enron.com            Bruce Smith           0.005172
22      780    eddie.zhang@enron.com            Eddie Zhang           0.005133
22      2387   tara.piazze@enron.com            Tara Piazze           0.005064
22      2543   chris.hyde@enron.com             Chris Hyde            0.005020
22      223    a..allen@enron.com               Thresa A. Allen       0.004516
22      317    steve.nat@enron.com              Steve Nat             0.004496
22      281    sonia.hennessy@enron.com         Sonia Hennessy        0.004337
22      370    min.zheng@enron.com              Min Zheng             0.004261
22      300    norman.lee@enron.com             Norman Lee            0.004216
22      665    paige.cox@enron.com              Paige Cox             0.004049
22      671    keith.dziadek@enron.com          Keith Dziadek         0.003917
22      638    arun.balasundaram@enron.com      Arun Balasundaram     0.003579
22      367    w..white@enron.com               Stacey W. White       0.003496
22      2599   matt.smith@enron.com             Matt Smith            0.003439
22      708    danielle.marcinkowski@enron.com  Danielle Marcinkowsk  0.003402
22      273    sivakumar.govindasamy@enron.com  Sivakumar Govindasam  0.003376


*******************************************************************************************************************
CATEGORY 23

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 1589  COMPONENTS: 10
LARGEST COMPONENT SIZE: 1557  PERCENT OF TOTAL GRAPH: 97.99%
GROUP DEGREE: 0.16189  GRAPH DENSITY: 0.00252
GROUP CLOSENESS: 0.00175  GROUP BETWEENNESS: 0.30866
AVERAGE p(z|u): 0.58  STDEV p(z|u): 0.39

MOST PROBABLE USERS

Topic#  ID#    Email Address                    Name                  p(z|u)
23      2530   chris.germany@enron.com          Chris Germany         0.053649
23      521    scott.goodell@enron.com          Scott Goodell         0.024669
23      550    victor.lamadrid@enron.com        Victor Lamadrid       0.023634
23      3005   joann.collins@enron.com          Joann Collins         0.017200
23      724    scott.neal@enron.com             Scott Neal            0.015794
23      602    robert.superty@enron.com         Robert Superty        0.015718
23      764    judy.townsend@enron.com          Judy Townsend         0.015184
23      466    robert.allwein@enron.com         Robert Allwein        0.014369
23      558    melba.lozano@enron.com           Melba Lozano          0.013378
23      514    clarissa.garcia@enron.com        Clarissa Garcia       0.012734
23      2537   john.hodge@enron.com             John Hodge            0.012242
23      604    tara.sweitzer@enron.com          Tara Sweitzer         0.011871
23      565    kevin.meredith@enron.com         Kevin Meredith        0.011767
23      14705  dick.jenkins@enron.com           Dick Jenkins          0.011169
23      22593  joan.veselack@enron.com                                0.010447
23      6071   steve.gillespie@enron.com                              0.010423
23      9194   katherine.kelly@enron.com                              0.009167
23      2578   w..pereira@enron.com             Susan W. Pereira      0.008454
23      1695   torrey.moorer@enron.com                                0.008301
23      2608   victoria.versen@enron.com        Victoria Versen       0.007969

CATEGORY 24

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 3336  COMPONENTS: 10
LARGEST COMPONENT SIZE: 3315  PERCENT OF TOTAL GRAPH: 99.37%
GROUP DEGREE: 0.09085  GRAPH DENSITY: 0.00120
GROUP CLOSENESS: 0.00246  GROUP BETWEENNESS: 0.15928
AVERAGE p(z|u): 0.80  STDEV p(z|u): 0.35

MOST PROBABLE USERS

Topic#  ID#    Email Address                    Name                  p(z|u)
24      773    .ward@enron.com                  houston               0.061620
24      15669  sscott5@enron.com                                      0.053147
24      427    chris.dorland@enron.com          Chris Dorland         0.052079
24      4493   lcampbel@enron.com                                     0.043749
24      1115   clint.dean@enron.com             Clint Dean            0.041200
24      2599   matt.smith@enron.com             Matt Smith            0.036769
24      3605   t..lucci@enron.com               Paul T. Lucci         0.030897
24      548    jeff.king@enron.com              Jeff King             0.021817
24      2515   martin.cuilla@enron.com          Martin Cuilla         0.018110
24      14660  theresa.staab@enron.com                                0.016999
24      1769   mark.whitt@enron.com                                   0.012742
24      2553   tori.kuykendall@enron.com        Tori Kuykendall       0.012416
24      3111   gerald.nemec@enron.com                                 0.012326
24      4111   jskilli@enron.com                                      0.010394
24      3688   .david@enron.com                 e-mail                0.008333
24      16024  mcuilla@enron.com                                      0.008100
24      939    .scott@enron.com                 e-mail                0.007764
24      2410   .mike@enron.com                  e-mail                0.007218
24      45957  j..bump@enron.com                Dan J. Bump           0.006795
24      3548   tyrell.harrison@enron.com        Tyrell Harrison       0.006499


CATEGORY 25

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 4141  COMPONENTS: 13
LARGEST COMPONENT SIZE: 4096  PERCENT OF TOTAL GRAPH: 98.91%
GROUP DEGREE: 0.08522  GRAPH DENSITY: 0.00097
GROUP CLOSENESS: 0.00079  GROUP BETWEENNESS: 0.16947
AVERAGE p(z|u): 0.51  STDEV p(z|u): 0.42

MOST PROBABLE USERS

Topic#  ID#    Email Address                    Name                  p(z|u)
25      1559   john.arnold@enron.com                                  0.070909
25      765    barry.tycholiz@enron.com         Barry Tycholiz        0.069074
25      680    mike.grigsby@enron.com           Mike Grigsby          0.051839
25      724    scott.neal@enron.com             Scott Neal            0.043372
25      717    stephanie.miller@enron.com       Stephanie Miller      0.035511
25      10241  phillip.allen@enron.com                                0.034590
25      629    k..allen@enron.com               Phillip K. Allen      0.027811
25      1769   mark.whitt@enron.com                                   0.025996
25      712    jonathan.mckay@enron.com         Jonathan Mckay        0.025471
25      687    keith.holst@enron.com            Keith Holst           0.018918
25      761    m..tholt@enron.com               Jane M. Tholt         0.018330
25      673    frank.ermis@enron.com            Frank Ermis           0.016955
25      2553   tori.kuykendall@enron.com        Tori Kuykendall       0.016572
25      10240  hunter.shively@enron.com                               0.015896
25      710    larry.may@enron.com              Larry May             0.011314
25      80     john.zufferli@enron.com                                0.011302
25      757    patti.sullivan@enron.com         Patti Sullivan        0.010944
25      782    don.black@enron.com              Don Black             0.010707
25      735    ina.rangel@enron.com             Ina Rangel            0.010327
25      675    l..gay@enron.com                 Randall L. Gay        0.009031


*******************************************************************************************************************
CATEGORY 26

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 2169  COMPONENTS: 11
LARGEST COMPONENT SIZE: 2139  PERCENT OF TOTAL GRAPH: 98.62%
GROUP DEGREE: 0.11042  GRAPH DENSITY: 0.00138
GROUP CLOSENESS: 0.00168  GROUP BETWEENNESS: 0.17887
AVERAGE p(z|u): 0.67  STDEV p(z|u): 0.40

MOST PROBABLE USERS

Topic#  ID#    Email Address                    Name                  p(z|u)
26      2530   chris.germany@enron.com          Chris Germany         0.106311
26      1688   ed.mcmichael@enron.com                                 0.054659
26      2202   a..shankman@enron.com            Jeffrey A. Shankman   0.049366
26      2514   ruth.concannon@enron.com         Ruth Concannon        0.032437
26      3540   maria.garza@enron.com            Maria Garza           0.026319
26      1399   eric.boyt@enron.com              Eric Boyt             0.021885
26      1230   phil.polsky@enron.com            Phil Polsky           0.020944
26      2497   robin.barbe@enron.com            Robin Barbe           0.020038
26      8756   margaret.dhont@enron.com         Margaret Dhont        0.016281
26      3021   scott.hendrickson@enron.com      Scott Hendrickson     0.016172
26      2575   joe.parks@enron.com              Joe Parks             0.015558
26      761    m..tholt@enron.com               Jane M. Tholt         0.014317
26      2065   mark.breese@enron.com            Mark Breese           0.014258
26      11389  garrick.hill@enron.com                                 0.014108
26      3619   louis.dicarlo@enron.com          Louis Dicarlo         0.013896
26      19787  v.weldon@enron.com                                     0.013674
26      774    charles.weldon@enron.com         V. Charles Weldon     0.011411
26      2549   l..kelly@enron.com               Katherine L. Kelly    0.010852
26      3468   doug.leach@enron.com                                   0.009439
26      17618  mike.mazowita@enron.com          Mike Mazowita         0.009038


************************************************************************************************* 
CATEGORY 27

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 5208  COMPONENTS: 7
LARGEST COMPONENT SIZE: 5170  PERCENT OF TOTAL GRAPH: 99.27%
GROUP DEGREE: 0.19343  GRAPH DENSITY: 0.00096
GROUP CLOSENESS: 0.00102  GROUP BETWEENNESS: 0.30968
AVERAGE p(z|u): 0.78  STDEV p(z|u): 0.37

MOST PROBABLE USERS

Topic#  ID#    Email Address                    Name                  p(z|u)
27      1654   j.kaminski@enron.com                                   0.092516
27      1145   benjamin.rogers@enron.com        Benjamin Rogers       0.074724
27      4058   jeff.skilling@enron.com                                0.046896
27      226    don.baughman@enron.com           Don Baughman Jr       0.042775
27      3441   kenneth.lay@enron.com                                  0.041885
27      3644   kim.ward@enron.com               Kim Ward              0.018471
27      4310   ebass@enron.com                                        0.018182
27      11250  e.taylor@enron.com                                     0.017595
27      607    d..thomas@enron.com              Paul D. Thomas        0.016358
27      488    mike.carson@enron.com            Mike Carson           0.016266
27      1140   joe.stepenovitch@enron.com       Joe Stepenovitch      0.016048
27      3659   scott.tholan@enron.com                                 0.015570
27      1691   don.miller@enron.com                                   0.015090
27      19919  lavorato@enron.com                                     0.014358
27      3457   gary.hickerson@enron.com                               0.014333
27      543    robert.johnston@enron.com        Robert Johnston       0.012577
27      1657   jeff.kinneman@enron.com                                0.011582
27      2439   john.brindle@enron.com                                 0.011234
27      3656   britt.whitman@enron.com          Britt Whitman         0.010267
27      1999   jaime.gualy@enron.com            Jaime Gualy           0.009446

CATEGORY 28

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 1210  COMPONENTS: 10
LARGEST COMPONENT SIZE: 1192  PERCENT OF TOTAL GRAPH: 98.35%
GROUP DEGREE: 0.10583  GRAPH DENSITY: 0.00330
GROUP CLOSENESS: 0.00338  GROUP BETWEENNESS: 0.19778
AVERAGE p(z|u): 0.46  STDEV p(z|u): 0.38

MOST PROBABLE USERS

Topic#  ID#    Email Address                    Name                  p(z|u)
28      2359   andy.zipper@enron.com            Andy Zipper           0.049795
28      1477   greg.whalley@enron.com                                 0.026279
28      4132   john.sherriff@enron.com                                0.022504
28      403    david.forster@enron.com                                0.022438
28      404    rahil.jafry@enron.com                                  0.021806
28      401    bob.shults@enron.com             Bob Shults            0.018842
28      1547   mark.palmer@enron.com                                  0.018281
28      293    louise.kitchen@enron.com         Louise Kitchen        0.015590
28      1200   savita.puthigai@enron.com        Savita Puthigai       0.014777
28      595    kal.shah@enron.com               Kal Shah              0.013226
28      3032   sheri.thomas@enron.com           Sheri Thomas          0.012491
28      4786   dave.samuels@enron.com                                 0.011941
28      3463   joseph.hirl@enron.com                                  0.010571
28      407    jennifer.denny@enron.com                               0.010295
28      4131   jay.fitzgerald@enron.com                               0.009952
28      1180   karen.denne@enron.com            Karen Denne           0.009583
28      3522   michael.bridges@enron.com        Michael Bridges       0.009314
28      503    daniel.diamond@enron.com         Daniel Diamond        0.009080
28      2168   meredith.philipp@enron.com       Meredith Philipp      0.008983
28      3041   george.mcclellan@enron.com       George Mcclellan      0.008740


CATEGORY 29

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 2753  COMPONENTS: 7
LARGEST COMPONENT SIZE: 2739  PERCENT OF TOTAL GRAPH: 99.49%
GROUP DEGREE: 0.10474  GRAPH DENSITY: 0.00145
GROUP CLOSENESS: 0.00486  GROUP BETWEENNESS: 0.18919
AVERAGE p(z|u): 0.53  STDEV p(z|u): 0.39

MOST PROBABLE USERS

Topic#  ID#    Email Address                    Name                  p(z|u)
29      9244   richard.sanders@enron.com                              0.101607
29      1544   rick.buy@enron.com                                     0.060187
29      4638   james.derrick@enron.com                                0.045886
29      11     monika.causholli@enron.com       Monika Causholli      0.022408
29      4940   rob.walls@enron.com                                    0.019351
29      17434  britt.davis@enron.com                                  0.017488
29      4796   ted.murphy@enron.com                                   0.017237
29      2381   richard.ring@enron.com           Richard Ring          0.015333
29      1625   david.gorte@enron.com                                  0.012697
29      788    david.port@enron.com             David Port            0.012687
29      17433  becky.zikes@enron.com            Becky Zikes           0.011415
29      2370   vladimir.gorny@enron.com         Vladimir Gorny        0.010935
29      3100   alan.aronowitz@enron.com                               0.010412
29      2279   andrew.edison@enron.com          Andrew Edison         0.010259
29      1460   c..williams@enron.com            Robert C. Williams    0.009466
29      481    s..bradford@enron.com            William S. Bradford   0.008875
29      20024  linda.guinn@enron.com                                  0.008756
29      19400  michael.robison@enron.com                              0.008493
29      4790   rex.rogers@enron.com                                   0.008313
29      72     chip.schneider@enron.com         Chip Schneider        0.008296


******************************************************************************************************************* 
CATEGORY 30

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 8143  COMPONENTS: 16
LARGEST COMPONENT SIZE: 8073  PERCENT OF TOTAL GRAPH: 99.14%
GROUP DEGREE: 0.06028  GRAPH DENSITY: 0.00086
GROUP CLOSENESS: 0.00036  GROUP BETWEENNESS: 0.06977
AVERAGE p(z|u): 0.87  STDEV p(z|u): 0.27

MOST PROBABLE USERS

Topic#  ID#    Email Address                    Name                  p(z|u)
30      12134  kevin.hyatt@enron.com            Kevin Hyatt           0.019493
30      22786  kimberly.watson@enron.com        Kimberly Watson       0.018918
30      8308   steven.harris@enron.com          Steven Harris         0.018019
30      43960  lynn.blair@enron.com                                   0.017466
30      46814  jdasovic@enron.com               "Jeff Dasovich "      0.016116
30      8303   drew.fossum@enron.com            Drew Fossum           0.015461
30      24943  michelle.lokay@enron.com                               0.014495
30      2280   shelley.corman@enron.com         Shelley Corman        0.014230
30      14935  susan.scott@enron.com                                  0.012075
30      9221   lorraine.lindberg@enron.com                            0.012052
30      24866  lindy.donoho@enron.com                                 0.011587
30      34417  alewis@ect.enron.com             Andrew Lewis          0.011248
30      21117  tk.lohman@enron.com              TK Lohman             0.010361
30      23333  darrell.schoolcraft@enron.com                          0.010153
30      23672  jeffery.fawcett@enron.com                              0.009416
30      29453  vkamins@enron.com                                      0.009225
30      2810   larry.campbell@enron.com                               0.007889
30      46859  smara@enron.com                  "                     0.006995
30      9096   rick.dietz@enron.com                                   0.006940
30      24902  glen.hass@enron.com                                    0.006572


***************************************************************************************************************** 
CATEGORY 31

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 2950  COMPONENTS: 9
LARGEST COMPONENT SIZE: 2929  PERCENT OF TOTAL GRAPH: 99.29%
GROUP DEGREE: 0.11836  GRAPH DENSITY: 0.00136
GROUP CLOSENESS: 0.00270  GROUP BETWEENNESS: 0.20908
AVERAGE p(z|u): 0.59  STDEV p(z|u): 0.38

MOST PROBABLE USERS

Topic#  ID#    Email Address                    Name                  p(z|u)
31      5897   mark.taylor@enron.com                                  0.180582
31      1637   rod.hayslett@enron.com                                 0.047611
31      18299  tracy.geaccone@enron.com         Tracy Geaccone        0.038249
31      21042  justin.boyd@enron.com            Justin Boyd           0.026876
31      3100   alan.aronowitz@enron.com                               0.024302
31      403    david.forster@enron.com                                0.020308
31      7573   james.saunders@enron.com         James Saunders        0.017830
31      20030  david.minns@enron.com                                  0.017761
31      3114   paul.simons@enron.com                                  0.015514
31      22321  edmund.cooper@enron.com                                0.013771
31      1698   susan.musch@enron.com                                  0.010063
31      2390   brent.hendry@enron.com           Brent Hendry          0.010054
31      20022  jane.mcbride@enron.com                                 0.009282
31      20015  john.viverito@enron.com                                0.008698
31      573    dale.neuner@enron.com            Dale Neuner           0.008112
31      5864   janine.juggins@enron.com                               0.008012
31      18391  mark.evans@enron.com                                   0.007935
31      15310  bob.chandler@enron.com                                 0.007393
31      3046   jonathan.whitehead@enron.com     Jonathan Whitehead    0.006590
31      603    john.suttle@enron.com            John Suttle           0.006409

CATEGORY 32

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 2781  COMPONENTS: 8
LARGEST COMPONENT SIZE: 2749  PERCENT OF TOTAL GRAPH: 98.85%
GROUP DEGREE: 0.10896  GRAPH DENSITY: 0.00216
GROUP CLOSENESS: 0.00157  GROUP BETWEENNESS: 0.26925
AVERAGE p(z|u): 0.53  STDEV p(z|u): 0.39

MOST PROBABLE USERS

Topic#  ID#    Email Address                    Name                  p(z|u)
32      3111   gerald.nemec@enron.com                                 0.205329
32      242    michelle.cash@enron.com          Michelle Cash         0.116556
32      1174   b..sanders@enron.com             Richard B. Sanders    0.067372
32      14696  lisa.mellencamp@enron.com        Lisa Mellencamp       0.034565
32      1465   twanda.sweet@enron.com                                 0.023181
32      284    t..hodge@enron.com               Jeffrey T. Hodge      0.023067
32      20018  stuart.zisman@enron.com                                0.018697
32      14697  barbara.gray@enron.com           Barbara Gray          0.018534
32      14717  eric.gillaspie@enron.com         Eric Gillaspie        0.018208
32      17250  steve.hooser@enron.com                                 0.014640
32      2357   brian.redmond@enron.com          Brian Redmond         0.011554
32      2551   mark.knippa@enron.com            Mark Knippa           0.010383
32      11156  sharon.butcher@enron.com         Sharon Butcher        0.009703
32      9264   kriste.sullivan@enron.com                              0.009641
32      9484   shonnie.daniel@enron.com                               0.009390
32      1460   c..williams@enron.com            Robert C. Williams    0.009278
32      4805   dan.lyons@enron.com                                    0.009216
32      1645   chris.hilgert@enron.com                                0.009090
32      1786   michael.tribolet@enron.com                             0.008538
32      24980  lizzette.palmer@enron.com                              0.008170


CATEGORY 33

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 4374  COMPONENTS: 15
LARGEST COMPONENT SIZE: 4331  PERCENT OF TOTAL GRAPH: 99.02%
GROUP DEGREE: 0.11010  GRAPH DENSITY: 0.00137
GROUP CLOSENESS: 0.00094  GROUP BETWEENNESS: 0.18954
AVERAGE p(z|u): 0.85  STDEV p(z|u): 0.29

MOST PROBABLE USERS

Topic#  ID#    Email Address                    Name                  p(z|u)
33      253    jeff.dasovich@enron.com          Jeff Dasovich         0.092484
33      817    richard.shapiro@enron.com        Richard Shapiro       0.059408
33      1489   james.steffes@enron.com                                0.037731
33      801    susan.mara@enron.com             Susan Mara            0.032505
33      1490   steven.kean@enron.com                                  0.027141
33      181    paul.kaufman@enron.com                                 0.026614
33      347    d..steffes@enron.com             James D. Steffes      0.026058
33      813    sarah.novosel@enron.com                                0.019082
33      818    linda.robertson@enron.com        Linda Robertson       0.018814
33      36     alan.comnes@enron.com            Alan Comnes           0.016450
33      2222   harry.kingerski@enron.com        Harry Kingerski       0.015687
33      17095  mary.hain@enron.com                                    0.015139
33      1474   joe.hartsoe@enron.com                                  0.012779
33      8546   sandra.mccubbin@enron.com        Sandra McCubbin       0.011379
33      1479   leslie.lawner@enron.com                                0.010388
33      1180   karen.denne@enron.com            Karen Denne           0.010284
33      28654  mona.petrochko@enron.com                               0.009778
33      66     steve.walton@enron.com           Steve Walton          0.009620
33      800    ray.alvarez@enron.com            Ray Alvarez           0.009360
33      812    l..nicolay@enron.com                                   0.009146


******************************************************************************************************************* 
CATEGORY 34

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 1063  COMPONENTS: 10
LARGEST COMPONENT SIZE: 1030  PERCENT OF TOTAL GRAPH: 96.90%
GROUP DEGREE: 0.12894  GRAPH DENSITY: 0.00282
GROUP CLOSENESS: 0.00229  GROUP BETWEENNESS: 0.39638
AVERAGE p(z|u): 0.42  STDEV p(z|u): 0.38

MOST PROBABLE USERS

Topic#  ID#    Email Address                    Name                  p(z|u)
34      14776  deb.korkmas@enron.com                                  0.014711
34      3105   nony.flores@enron.com                                  0.012424
34      3110   julia.murray@enron.com                                 0.011986
34      2189   wayne.gresham@enron.com          Wayne Gresham         0.011719
34      255    angela.davis@enron.com           Angela Davis          0.010521
34      3100   alan.aronowitz@enron.com                               0.010224
34      15220  lou.stoler@enron.com                                   0.009819
34      3408   mary.ogden@enron.com                                   0.009693
34      269    genia.fitzgerald@enron.com       Genia Fitzgerald      0.009521
34      17561  matt.maxwell@enron.com                                 0.009483
34      12004  suzanne.adams@enron.com          Suzanne Adams         0.009145
34      4805   dan.lyons@enron.com                                    0.009024
34      20036  brenda.whitehead@enron.com                             0.008511
34      20015  john.viverito@enron.com                                0.008412
34      14696  lisa.mellencamp@enron.com        Lisa Mellencamp       0.008388
34      18646  ann.white@enron.com                                    0.008337
34      8238   pat.radford@enron.com                                  0.008268
34      155    elizabeth.sager@enron.com        Elizabeth Sager       0.008210
34      154    sheila.tweed@enron.com           Sheila Tweed          0.008055
34      1465   twanda.sweet@enron.com                                 0.008041


******************************************************************************************************************* 
CATEGORY 35

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 2420  COMPONENTS: 11
LARGEST COMPONENT SIZE: 2400  PERCENT OF TOTAL GRAPH: 99.17%
GROUP DEGREE: 0.15503  GRAPH DENSITY: 0.00207
GROUP CLOSENESS: 0.00301  GROUP BETWEENNESS: 0.27894
AVERAGE p(z|u): 0.66  STDEV p(z|u): 0.39

MOST PROBABLE USERS

Topic#  ID#    Email Address                    Name                  p(z|u)
35      3113   sara.shackleton@enron.com                              0.323216
35      155    elizabeth.sager@enron.com        Elizabeth Sager       0.099711
35      1101   tanya.rohauer@enron.com          Tanya Rohauer         0.028220
35      206    christian.yoder@enron.com                              0.027757
35      590    edward.sacks@enron.com           Edward Sacks          0.020897
35      437    peter.keohane@enron.com          Peter Keohane         0.020067
35      4854   william.bradford@enron.com                             0.019964
35      3029   sheila.glover@enron.com          Sheila Glover         0.019889
35      144    tracy.ngo@enron.com                                    0.019784
35      1081   paul.radous@enron.com            Paul Radous           0.017521
35      436    greg.johnston@enron.com          Greg Johnston         0.016513
35      329    david.portz@enron.com            David Portz           0.015797
35      1175   arsystem@mailman.enron.com       ARSystem              0.014454
35      481    s..bradford@enron.com            William S. Bradford   0.011189
35      19925  mark.e.haedicke@enron.com                              0.011033
35      1090   harlan.murphy@enron.com          Harlan Murphy         0.010609
35      1091   carol.st.@enron.com              Carol St. Clair       0.010408
35      17099  shari.stack@enron.com                                  0.010257
35      1019   leslie.hansen@enron.com          Leslie Hansen         0.010129
35      423    sharon.crawford@enron.com        Sharon Crawford       0.008823

CATEGORY 36

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 1871  COMPONENTS: 8
LARGEST COMPONENT SIZE: 1850  PERCENT OF TOTAL GRAPH: 98.88%
GROUP DEGREE: 0.11577  GRAPH DENSITY: 0.00160
GROUP CLOSENESS: 0.00299  GROUP BETWEENNESS: 0.20861
AVERAGE p(z|u): 0.63  STDEV p(z|u): 0.40

MOST PROBABLE USERS

Topic#  ID#    Email Address                    Name                  p(z|u)
36      226    don.baughman@enron.com           Don Baughman Jr       0.018716
36      1126   tom.may@enron.com                Tom May               0.017882
36      1120   juan.hernandez@enron.com         Juan Hernandez        0.016727
36      1104   kayne.coulter@enron.com          Kayne Coulter         0.015195
36      314    jeffrey.miller@enron.com         Jeffrey Miller        0.014335
36      292    john.kinser@enron.com            John Kinser           0.011916
36      591    eric.saibi@enron.com             Eric Saibi            0.011054
36      1140   joe.stepenovitch@enron.com       Joe Stepenovitch      0.010953
36      477    robert.benson@enron.com          Robert Benson         0.010849
36      267    joe.errigo@enron.com             Joe Errigo            0.010644
36      618    lloyd.will@enron.com             Lloyd Will            0.010380
36      1048   m..forney@enron.com              John M. Forney        0.010335
36      1096   dean.laurent@enron.com           Dean Laurent          0.009862
36      276    patrick.hanse@enron.com          Patrick Hanse         0.009811
36      1148   gautam.gupta@enron.com           Gautam Gupta          0.009177
36      1132   bill.rust@enron.com              Bill Rust             0.008977
36      1110   rudy.acevedo@enron.com           Rudy Acevedo          0.008481
36      1138   doug.sewell@enron.com            Doug Sewell           0.008430
36      1128   steve.olinde@enron.com           Steve Olinde Jr       0.008328
36      230    corry.bentley@enron.com          Corry Bentley         0.007989


CATEGORY  37 

EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  3099  COMPONENTS:  3 

LARGEST  COMPONENT  SIZE:  3095  PERCENT  OF  TOTAL  GRAPH:  99.87%
GROUP  DEGREE:  0.11962  GRAPH  DENSITY:  0.00161

GROUP  CLOSENESS:  0.03531  GROUP  BETWEENNESS:  0.30933

AVERAGE  p(z|u):  0.48  STDEV  p(z|u):  0.40

MOST  PROBABLE  USERS 
Topic#  ID#    Email Address                        Name                  p(z|u)
37      293    louise.kitchen@enron.com             Louise Kitchen        0.234172
37      701    john.lavorato@enron.com              John Lavorato         0.046942
37      367    w..white@enron.com                   Stacey W. White       0.036714
37      222    david.oxley@enron.com                David Oxley           0.029612
37      756    geoff.storey@enron.com               Geoff Storey          0.023718
37      747    s..shively@enron.com                 Hunter S. Shively     0.022801
37      1978   greg.piper@enron.com                                       0.021923
37      1990   harry.arora@enron.com                Harry Arora           0.021570
37      2357   brian.redmond@enron.com              Brian Redmond         0.021237
37      365    jay.webb@enron.com                   Jay Webb              0.020057
37      709    a..martin@enron.com                  Thomas A. Martin      0.019743
37      34     f..calger@enron.com                  Christopher F. Calge  0.019427
37      743    tammie.schoppe@enron.com             Tammie Schoppe        0.019014
37      403    david.forster@enron.com                                    0.018450
37      2235   beth.perlman@enron.com               Beth Perlman          0.015320
37      80     john.zufferli@enron.com                                    0.014154
37      1676   laura.luce@enron.com                                       0.013896
37      14653  stephen.stock@enron.com                                    0.013537
37      3608   jean.mrha@enron.com                                        0.013413
37      279    frank.hayden@enron.com               Frank Hayden          0.011336


************************************************************************
CATEGORY  38


EXPLICIT  SOCIAL  NETWORK  STATISTICS

VERTICES:  1532  COMPONENTS:  13

LARGEST  COMPONENT  SIZE:  1505  PERCENT  OF  TOTAL  GRAPH:  98.24%
GROUP  DEGREE:  0.15242  GRAPH  DENSITY:  0.00131

GROUP  CLOSENESS:  0.00236  GROUP  BETWEENNESS:  0.29807

AVERAGE  p(z|u):  0.36  STDEV  p(z|u):  0.38


MOST  PROBABLE  USERS


Topic#  ID#    Email Address                        Name                  p(z|u)
38      495    wes.colwell@enron.com                Wes Colwell           0.029936
38      1462   kimberly.hillis@enron.com                                  0.018186
38      1461   kay.chapman@enron.com                                      0.018063
38      1689   jennifer.medcalf@enron.com                                 0.011990
38      1452   david.delainey@enron.com             David Delainey        0.009455
38      608    shirley.tijerina@enron.com           Shirley Tijerina      0.008885
38      1600   jeff.donahue@enron.com                                     0.008758
38      5073   marsha.schiller@enron.com                                  0.008513
38      3457   gary.hickerson@enron.com                                   0.008457
38      5062   cathy.phillips@enron.com                                   0.007726
38      3041   george.mcclellan@enron.com           George McClellan      0.007609
38      701    john.lavorato@enron.com              John Lavorato         0.007603
38      2357   brian.redmond@enron.com              Brian Redmond         0.007557
38      3484   raymond.bowen@enron.com                                    0.007528
38      4654   jennifer.burns@enron.com                                   0.006991
38      1453   janet.dietrich@enron.com             Janet Dietrich        0.006891
38      227    sally.beck@enron.com                 Sally Beck            0.006630
38      743    tammie.schoppe@enron.com             Tammie Schoppe        0.006010
38      740    tina.rode@enron.com                  Tina Rode             0.006006
38      8357   airam.arteaga@enron.com                                    0.005964


***************************************************************************************************************** 
CATEGORY  39 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  3714  COMPONENTS:  14 

LARGEST  COMPONENT  SIZE:  3668  PERCENT  OF  TOTAL  GRAPH:  98.76%
GROUP  DEGREE:  0.10989  GRAPH  DENSITY:  0.00108

GROUP  CLOSENESS:  0.00082  GROUP  BETWEENNESS:  0.13935

AVERAGE  p(z|u):  0.79  STDEV  p(z|u):  0.36


MOST  PROBABLE  USERS 


Topic#  ID#    Email Address                        Name                  p(z|u)
39      642    eric.bass@enron.com                  Eric Bass             0.075491
39      14935  susan.scott@enron.com                                      0.041222
39      795    john.griffith@enron.com              John Griffith         0.032742
39      707    mike.maggi@enron.com                 Mike Maggi            0.032217
39      6992   judy.hernandez@enron.com             Judy Hernandez        0.030989
39      621    jason.wolfe@enron.com                Jason Wolfe           0.028239
39      719    l..mims@enron.com                    Patrice L. Mims       0.020541
39      2347   h..lewis@enron.com                   Andrew H. Lewis       0.019604
39      8773   michelle.nelson@enron.com            Michelle Nelson       0.017668
39      1181   exchange.administrator@enron.com                           0.017242
39      1878   bryan.hull@enron.com                 Bryan Hull            0.016953
39      15321  shanna.husser@enron.com              Shanna Husser         0.013785
39      8917   timothy.blanchard@enron.com                                0.013342
39      6702   chad.landry@enron.com                Chad Landry           0.011253
39      22081  dfarmer@enron.com                                          0.010327
39      15272  phillip.love@enron.com               Phillip Love          0.009138
39      15748  plove@enron.com                                            0.008331
39      19886  leslie.smith@enron.com                                     0.008298
39      6580   angela.barnett@enron.com             Angela Barnett        0.008284
39      18727  regina.blackshear@enron.com          Regina Blackshear     0.008093
CATEGORY  40


EXPLICIT  SOCIAL  NETWORK  STATISTICS

VERTICES:  532  COMPONENTS:  17

LARGEST  COMPONENT  SIZE:  420  PERCENT  OF  TOTAL  GRAPH:  78.95%
GROUP  DEGREE:  0.36089  GRAPH  DENSITY:  0.00377

GROUP  CLOSENESS:  0.00151  GROUP  BETWEENNESS:  0.46753

AVERAGE  p(z|u):  0.53  STDEV  p(z|u):  0.41


MOST  PROBABLE  USERS


Topic#  ID#    Email Address                        Name                  p(z|u)
40      549    lisa.kinsey@enron.com                Lisa Kinsey           0.014421
40      602    robert.superty@enron.com             Robert Superty        0.010874
40      757    patti.sullivan@enron.com             Patti Sullivan        0.010638
40      1612   j..farmer@enron.com                                        0.008489
40      550    victor.lamadrid@enron.com            Victor Lamadrid       0.008465
40      6839   darla.saucier@enron.com              Darla Saucier         0.008191
40      2998   kirk.lenart@enron.com                Kirk Lenart           0.007214
40      3000   tammy.gilmore@enron.com              Tammy Gilmore         0.006947
40      579    cora.pendergrass@enron.com           Cora Pendergrass      0.006820
40      593    l..schrab@enron.com                  Mark L. Schrab        0.006068
40      732    richard.pinion@enron.com             Richard Pinion        0.005953
40      514    clarissa.garcia@enron.com            Clarissa Garcia       0.005689
40      3023   brandee.jackson@enron.com            Brandee Jackson       0.005519
40      14674  s..olinger@enron.com                 Kimberly S. Olinger   0.005491
40      3003   christina.sanchez@enron.com          Christina Sanchez     0.005468
40      8755   mark.mcclure@enron.com               Mark Mcclure          0.005446
40      1400   donna.greif@enron.com                Donna Greif           0.005408
40      241    suzanne.calcagno@enron.com           Suzanne Calcagno      0.004959
40      660    suzanne.christiansen@enron.com       Suzanne Christiansen  0.004888
40      482    kevin.brady@enron.com                Kevin Brady           0.004703


CATEGORY  41 

EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  2290  COMPONENTS:  8 

LARGEST  COMPONENT  SIZE:  2261  PERCENT  OF  TOTAL  GRAPH:  98.73%
GROUP  DEGREE:  0.23811  GRAPH  DENSITY:  0.00218

GROUP  CLOSENESS:  0.00169  GROUP  BETWEENNESS:  0.34927

AVERAGE  p(z|u):  0.40  STDEV  p(z|u):  0.36

MOST  PROBABLE  USERS 
Topic#  ID#    Email Address                        Name                  p(z|u)
41      15554  daren.farmer@enron.com                                     0.059135
41      15921  pat.clynes@enron.com                                       0.015067
41      11079  melissa.graves@enron.com             Melissa Graves        0.012731
41      14684  robert.cotten@enron.com              Robert Cotten         0.009928
41      11108  julie.meyers@enron.com               Julie Meyers          0.008283
41      1777   rita.wynne@enron.com                                       0.007517
41      11148  george.weissman@enron.com            George Weissman       0.007224
41      1612   j..farmer@enron.com                                        0.007084
41      9017   lauri.allen@enron.com                                      0.006958
41      18648  vance.taylor@enron.com                                     0.006821
41      11138  edward.terry@enron.com               Edward Terry          0.006820
41      11094  gary.lamphier@enron.com              Gary Lamphier         0.006220
41      11063  howard.camp@enron.com                Howard Camp           0.005986
41      18649  donald.reinhardt@enron.com                                 0.005897
41      14687  edward.gottlob@enron.com             Edward Gottlob        0.005801
41      7637   susan.smith@enron.com                Susan Smith           0.005795
41      11065  clem.cernosek@enron.com              Clem Cernosek         0.005452
41      11128  carlos.rodriguez@enron.com           Carlos Rodriguez      0.005397
41      15753  jackie.young@enron.com                                     0.005229
41      20240  robert.lloyd@enron.com                                     0.005144


************************************************************************
CATEGORY  42


EXPLICIT  SOCIAL  NETWORK  STATISTICS

VERTICES:  2748  COMPONENTS:  10

LARGEST  COMPONENT  SIZE:  2628  PERCENT  OF  TOTAL  GRAPH:  95.63%
GROUP  DEGREE:  0.08948  GRAPH  DENSITY:  0.00182

GROUP  CLOSENESS:  0.00043  GROUP  BETWEENNESS:  0.09911

AVERAGE  p(z|u):  0.42  STDEV  p(z|u):  0.40


MOST  PROBABLE  USERS


Topic#  ID#    Email Address                        Name                  p(z|u)
42      706    m..love@enron.com                    Phillip M. Love       0.056662
42      698    kam.keiser@enron.com                 Kam Keiser            0.055671
42      713    errol.mclaughlin@enron.com           Errol McLaughlin Jr.  0.043189
42      14765  darron.giron@enron.com                                     0.032354
42      678    c..giron@enron.com                   Darron C. Giron       0.031285
42      679    c..gossett@enron.com                 Jeffrey C. Gossett    0.024133
42      15272  phillip.love@enron.com               Phillip Love          0.021333
42      644    david.baumbach@enron.com             David Baumbach        0.016250
42      8849   robin.rodrigue@enron.com             Robin Rodrigue        0.016190
42      718    bruce.mills@enron.com                Bruce Mills           0.012725
42      6266   jeffrey.gossett@enron.com            Jeffrey C Gossett     0.012169
42      730    scott.palmer@enron.com               B. Scott Palmer       0.011753
42      3019   anne.bike@enron.com                  Anne Bike             0.011640
42      777    d..winfree@enron.com                 Neal D                0.011403
42      767    john.valdes@enron.com                John Valdes           0.009893
42      452    kathy.reeves@enron.com               Kathy Reeves          0.008306
42      2520   tom.donohoe@enron.com                Tom Donohoe           0.007935
42      6878   greg.couch@enron.com                 Greg Couch            0.007896
42      23910  mary.fischer@enron.com               Mary Fischer          0.007456
42      303    s..lim@enron.com                     Francis S. Lim        0.007159


****************************************************************************************************************** 
CATEGORY  43 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  517  COMPONENTS:  22

LARGEST  COMPONENT  SIZE:  418  PERCENT  OF  TOTAL  GRAPH:  80.85%
GROUP  DEGREE:  0.36939  GRAPH  DENSITY:  0.00388

GROUP  CLOSENESS:  0.00165  GROUP  BETWEENNESS:  0.55721

AVERAGE  p(z|u):  0.28  STDEV  p(z|u):  0.35


MOST  PROBABLE  USERS


Topic#  ID#    Email Address                        Name                  p(z|u)
43      11132  tom.shelton@enron.com                Tom Shelton           0.004754
43      14687  edward.gottlob@enron.com             Edward Gottlob        0.004671
43      18645  steve.schneider@enron.com                                  0.004615
43      11074  michael.eiben@enron.com              Michael Eiben         0.004143
43      10232  jim.coffey@enron.com                                       0.003795
43      11055  brad.blevins@enron.com               Brad Blevins          0.003472
43      11077  irene.flynn@enron.com                Irene Flynn           0.003461
43      4973   lillian.carroll@enron.com                                  0.003259
43      19231  karry.kendall@enron.com                                    0.003234
43      11066  nick.cocavessis@enron.com            Nick Cocavessis       0.003201
43      1577   carol.carter@enron.com                                     0.003179
43      20310  emma.welsch@enron.com                                      0.003177
43      15687  james.mckay@enron.com                                      0.003099
43      11088  nathan.hlavaty@enron.com             Nathan Hlavaty        0.003067
43      1777   rita.wynne@enron.com                                       0.002883
43      10242  thomas.martin@enron.com                                    0.002840
43      11072  cheryl.dudley@enron.com              Cheryl Dudley         0.002760
43      15578  janet.wallis@enron.com                                     0.002752
43      9146   james.haden@enron.com                                      0.002712
43      11064  molly.carriere@enron.com             Molly Carriere        0.002711
CATEGORY  44


EXPLICIT  SOCIAL  NETWORK  STATISTICS

VERTICES:  3600  COMPONENTS:  8

LARGEST  COMPONENT  SIZE:  3570  PERCENT  OF  TOTAL  GRAPH:  99.17%
GROUP  DEGREE:  0.15460  GRAPH  DENSITY:  0.00139

GROUP  CLOSENESS:  0.00142  GROUP  BETWEENNESS:  0.22946

AVERAGE  p(z|u):  0.64  STDEV  p(z|u):  0.39


MOST  PROBABLE  USERS


Topic#  ID#    Email Address                        Name                  p(z|u)
44      8      bill.williams@enron.com              Bill Williams III     0.040777
44      21     kate.symes@enron.com                 Kate Symes            0.040168
44      37     tim.belden@enron.com                 Tim Belden            0.031482
44      59     diana.scholtes@enron.com             Diana Scholtes        0.030788
44      124    chris.stokley@enron.com              Chris Stokley         0.027118
44      42     jeff.richter@enron.com               Jeff Richter          0.025783
44      54     sean.crandall@enron.com              Sean Crandall         0.025680
44      47     cara.semperger@enron.com             Cara Semperger        0.022904
44      57     matt.motley@enron.com                Matt Motley           0.022533
44      55     robert.badeer@enron.com              Robert Badeer         0.022190
44      14     mark.guzman@enron.com                Mark Guzman           0.022096
44      53     mark.fischer@enron.com               Mark Fischer          0.021644
44      52     tom.alonso@enron.com                 Tom Alonso            0.021624
44      60     mike.swerzbin@enron.com              Mike Swerzbin         0.021233
44      93     phillip.platter@enron.com            Phillip Platter       0.020663
44      38     chris.mallory@enron.com              Chris Mallory         0.018266
44      92     holden.salisbury@enron.com           Holden Salisbury      0.016172
44      175    lisa.gang@enron.com                                        0.015194
44      58     p..o'neil@enron.com                  Murray P. Neil        0.014506
44      145    stewart.rosman@enron.com             Stewart Rosman        0.010287


CATEGORY  45 

EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  4800  COMPONENTS:  8 

LARGEST  COMPONENT  SIZE:  4777  PERCENT  OF  TOTAL  GRAPH:  99.52%
GROUP  DEGREE:  0.18445  GRAPH  DENSITY:  0.00083

GROUP  CLOSENESS:  0.00222  GROUP  BETWEENNESS:  0.29961

AVERAGE  p(z|u):  0.53  STDEV  p(z|u):  0.39

MOST  PROBABLE  USERS 
Topic#  ID#    Email Address                        Name                  p(z|u)
45      1490   steven.kean@enron.com                                      0.057847
45      4769   stanley.horton@enron.com                                   0.039712
45      1454   j..kean@enron.com                    Steven J. Kean        0.024714
45      1463   maureen.mcvicker@enron.com                                 0.023896
45      3536   rosalee.fleming@enron.com            Rosalee Fleming       0.021112
45      1477   greg.whalley@enron.com                                     0.017872
45      3538   mark.frevert@enron.com               Mark Frevert          0.015626
45      3441   kenneth.lay@enron.com                                      0.015614
45      2201   cindy.olson@enron.com                Cindy Olson           0.014615
45      3444   jeffrey.mcmahon@enron.com                                  0.013126
45      1543   richard.causey@enron.com                                   0.011493
45      3443   mark.koenig@enron.com                                      0.010834
45      4140   cindy.stark@enron.com                                      0.010079
45      2176   jim.fallon@enron.com                 Jim Fallon            0.009690
45      3484   raymond.bowen@enron.com                                    0.009399
45      4664   sherri.sera@enron.com                                      0.009346
45      1637   rod.hayslett@enron.com                                     0.009284
45      1180   karen.denne@enron.com                Karen Denne           0.008729
45      4058   jeff.skilling@enron.com                                    0.008640
45      2287   mike.mcconnell@enron.com             Mike Mcconnell        0.008524


******************************************************************************************************************* 
CATEGORY  46 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  2551  COMPONENTS:  9 

LARGEST  COMPONENT  SIZE:  2529  PERCENT  OF  TOTAL  GRAPH:  99.14%
GROUP  DEGREE:  0.31489  GRAPH  DENSITY:  0.00118

GROUP  CLOSENESS:  0.00267  GROUP  BETWEENNESS:  0.53932

AVERAGE  p(z|u):  0.44  STDEV  p(z|u):  0.40


MOST  PROBABLE  USERS 


Topic#  ID#    Email Address                        Name                  p(z|u)
46      584    m..presto@enron.com                  Kevin M. Presto       0.072968
46      519    doug.gilbert-smith@enron.com         Doug Gilbert-smith    0.040231
46      499    dana.davis@enron.com                 Mark Dana Davis       0.037652
46      1642   rogers.herndon@enron.com                                   0.031286
46      618    lloyd.will@enron.com                 Lloyd Will            0.027930
46      2206   stacey.bolton@enron.com              Stacey Bolton         0.023560
46      601    j..sturm@enron.com                   Fletcher J. Sturm     0.022593
46      2381   richard.ring@enron.com               Richard Ring          0.017835
46      516    chris.gaskill@enron.com              Chris Gaskill         0.016085
46      63     elliot.mainzer@enron.com             Elliot Mainzer        0.015746
46      14954  mary.schoen@enron.com                                      0.013194
46      2538   kelly.holman@enron.com               Kelly Holman          0.012776
46      56     tim.heizenrader@enron.com            Tim Heizenrader       0.010846
46      812    l..nicolay@enron.com                                       0.010817
46      498    mike.curry@enron.com                 Mike Curry            0.010795
46      5956   michael.terraso@enron.com                                  0.010633
46      95     center.dl-portland@enron.com         DL-Portland World Tr  0.010536
46      2242   lisa.jacobson@enron.com              Lisa Jacobson         0.009560
46      37     tim.belden@enron.com                 Tim Belden            0.009261
46      347    d..steffes@enron.com                 James D. Steffes      0.008804


************************************************************************

CATEGORY  47


EXPLICIT  SOCIAL  NETWORK  STATISTICS

VERTICES:  2855  COMPONENTS:  9

LARGEST  COMPONENT  SIZE:  2835  PERCENT  OF  TOTAL  GRAPH:  99.30%
GROUP  DEGREE:  0.14328  GRAPH  DENSITY:  0.00140

GROUP  CLOSENESS:  0.00328  GROUP  BETWEENNESS:  0.33934

AVERAGE  p(z|u):  0.59  STDEV  p(z|u):  0.39


MOST  PROBABLE  USERS


Topic#  ID#    Email Address                        Name                  p(z|u)
47      5335   vince.kaminski@enron.com                                   0.340735
47      4134   jeffrey.shankman@enron.com                                 0.061597
47      2287   mike.mcconnell@enron.com             Mike Mcconnell        0.052482
47      2058   shirley.crenshaw@enron.com           Shirley Crenshaw      0.049972
47      4807   stinson.gibner@enron.com                                   0.040387
47      2106   vasant.shanbhogue@enron.com          Vasant Shanbhogue     0.026319
47      30143  vince.j.kaminski@enron.com           Vince J" "Kaminski    0.017587
47      2102   tanya.tamarchenko@enron.com          Tanya Tamarchenko     0.015234
47      2109   zimin.lu@enron.com                   Zimin Lu              0.015077
47      4654   jennifer.burns@enron.com                                   0.014087
47      20277  grant.masson@enron.com                                     0.012089
47      1666   pinnamaneni.krishnarao@enron.com                           0.012013
47      1388   christie.patrick@enron.com           Christie Patrick      0.011853
47      6033   mike.roberts@enron.com                                     0.010805
47      4664   sherri.sera@enron.com                                      0.009806
47      4785   john.nowlan@enron.com                                      0.009706
47      2069   amitava.dhar@enron.com               Amitava Dhar          0.008304
47      1748   dale.surbey@enron.com                                      0.007335
47      1755   ravi.thuraisingham@enron.com                               0.006944
47      15290  molly.magee@enron.com                                      0.006293
B.2  Author  Topic  with  only  Dictionary  Words


CATEGORY  0


EXPLICIT  SOCIAL  NETWORK  STATISTICS

VERTICES:  6273  COMPONENTS:  10

LARGEST  COMPONENT  SIZE:  6252  PERCENT  OF  TOTAL  GRAPH:  99.67%
GROUP  DEGREE:  0.19910  GRAPH  DENSITY:  0.00096

GROUP  CLOSENESS:  0.00248  GROUP  BETWEENNESS:  0.26975

AVERAGE  p(z|u):  0.00  STDEV  p(z|u):  0.00


MOST  PROBABLE  USERS


Topic#  ID#    Email Address                        Name                  p(z|u)
0       36     alan.comnes@enron.com                Alan Comnes           0.000274
0       37     tim.belden@enron.com                 Tim Belden            0.000274
0       256    pete.davis@enron.com                 Pete Davis            0.000274
0       403    david.forster@enron.com                                    0.000274
0       1651   ben.jacoby@enron.com                                       0.000274
0       2222   harry.kingerski@enron.com            Harry Kingerski       0.000274
0       8      bill.williams@enron.com              Bill Williams III     0.000258
0       34     f..calger@enron.com                  Christopher F. Calge  0.000258
0       41     h..foster@enron.com                  Chris H. Foster       0.000258
0       42     jeff.richter@enron.com               Jeff Richter          0.000258
0       51     debra.davidson@enron.com             Debra Davidson        0.000258
0       58     p..o'neil@enron.com                  Murray P. Neil        0.000258
0       62     john.postlethwaite@enron.com         John Postlethwaite    0.000258
0       66     steve.walton@enron.com               Steve Walton          0.000258
0       67     dave.perrino@enron.com               Dave Perrino          0.000258
0       72     chip.schneider@enron.com             Chip Schneider        0.000258
0       89     greg.wolfe@enron.com                 Greg Wolfe            0.000258
0       99     samantha.law@enron.com                                     0.000258
0       124    chris.stokley@enron.com              Chris Stokley         0.000258
0       144    tracy.ngo@enron.com                                        0.000258


CATEGORY  1 

EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  7078  COMPONENTS:  12 

LARGEST  COMPONENT  SIZE:  7051  PERCENT  OF  TOTAL  GRAPH:  99.62%
GROUP  DEGREE:  0.04205  GRAPH  DENSITY:  0.00085

GROUP  CLOSENESS:  0.00156  GROUP  BETWEENNESS:  0.07972

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.03
MOST  PROBABLE  USERS


Topic#  ID#    Email Address                        Name                  p(z|u)
1       8494   trushar.patel@enron.com              Trushar Patel         0.000936
1       25328  tim.mckone@enron.com                                       0.000857
1       33839  mgocker@enron.com                                          0.000831
1       29433  legal.4@enron.com                                          0.000808
1       83451  kevin.d.jordan@enron.com                                   0.000757
1       34227  robert.c.williams@enron.com          robert.c.williams     0.000701
1       41550  anne.c.koehler@enron.com                                   0.000695
1       1592   charles.delacey@enron.com                                  0.000647
1       80963  staci holtzman@enron.com                                   0.000628
1       47812  clong@enron.com                                            0.000625
1       41717  brenda.l.funk@enron.com              Brenda L." "Funk      0.000623
1       15912  russell.kelley@enron.com                                   0.000578
1       64778  'deberry@enron.com                                         0.000563
1       77375  tillett@enron.com                                          0.000551
1       12277  .gerald@enron.com                    e-mail                0.000531
1       41565  sstack@enron.com                                           0.000524
1       41517  jkeller@enron.com                                          0.000507
1       41581  bdavis@enron.com                                           0.000445
1       17749  martin.smith@enron.com               Martin Smith          0.000439
1       48505  kpurbhoo@enron.com                                         0.000439


CATEGORY  2 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  5543  COMPONENTS:  8 

LARGEST  COMPONENT  SIZE:  5526  PERCENT  OF  TOTAL  GRAPH:  99.69%
GROUP  DEGREE:  0.06472  GRAPH  DENSITY:  0.00108

GROUP  CLOSENESS:  0.00353  GROUP  BETWEENNESS:  0.08963

AVERAGE  p(z|u):  0.03  STDEV  p(z|u):  0.03


MOST  PROBABLE  USERS 


Topic#  ID#    Email Address                        Name                  p(z|u)
2       6815   debra.perlingiere@enron.com          Debra Perlingiere     0.000935
2       18962  lynn.shivers@enron.com               Lynn Shivers          0.000868
2       15231  joanne.rozycki@enron.com             Joanne Rozycki        0.000809
2       19052  diane.ellstrom@enron.com                                   0.000790
2       20224  andrea.guillen@enron.com                                   0.000638
2       7131   bill.bowes@enron.com                 Bill Bowes            0.000605
2       29429  majed.nachawati@enron.com                                  0.000505
2       533    gordon.heaney@enron.com              Gordon Heaney         0.000477
2       42425  allison.mchenry@enron.com                                  0.000445
2       6673   celeste.cisneros@enron.com           Celeste Cisneros      0.000434
2       14756  r..williams@enron.com                Jason R. Williams     0.000434
2       81534  una.feeley@enron.com                 Una Feeley            0.000434
2       3440   esmeralda.gonzalez@enron.com         Esmeralda Gonzalez    0.000422
2       7476   andrew.ralston@enron.com             Andrew Ralston        0.000418
2       301    pinto.leite@enron.com                Francisco Pinto Leit  0.000416
2       81683  james.canney@enron.com               James Canney          0.000402
2       14777  kay.young@enron.com                                        0.000394
2       17737  carol.north@enron.com                Carol North           0.000386
2       10246  steven.kleege@enron.com                                    0.000379
2       18037  nidia.mendoza@enron.com              Nidia Mendoza         0.000351


************************************************************************
CATEGORY  3


EXPLICIT  SOCIAL  NETWORK  STATISTICS

VERTICES:  6912  COMPONENTS:  15

LARGEST  COMPONENT  SIZE:  6870  PERCENT  OF  TOTAL  GRAPH:  99.39%
GROUP  DEGREE:  0.08246  GRAPH  DENSITY:  0.00087

GROUP  CLOSENESS:  0.00078  GROUP  BETWEENNESS:  0.11972

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.02


MOST  PROBABLE  USERS


Topic#  ID#    Email Address                        Name                  p(z|u)
3       1175   arsystem@mailman.enron.com           ARSystem              0.001519
3       1488   perfmgmt@enron.com                   "Performance Evaluat  0.001516
3       22319  perfmgmt@ect.enron.com                                     0.001382
3       41760  matt.dawson@enron.com                                      0.001118
3       20117  arsystem@ect.enron.com                                     0.001083
3       14566  approval.eol.gas.traders@enron.com                         0.001019
3       9320   information.management@enron.com                           0.000775
3       39760  fletcher.j.sturm@enron.com           "fletcher.j.sturm@en  0.000767
3       41075  daemon.extra@enron.com               EXTRA Mailer Daemon   0.000709
3       19486  steve.beck@enron.com                                       0.000629
3       131    maria.van@enron.com                  Maria Van houten      0.000448
3       22380  m..hall@enron.com                    Bob M. Hall           0.000419
3       464    sunil.abraham@enron.com              Sunil Abraham         0.000389
3       32858  arsystem@enron.com                                         0.000377
3       37128  dl-ga-pas@enron.com                  DL-GA-PAS             0.000371
3       30857  erequest@enron.com                                         0.000365
3       15555  neal.d.winfree@enron.com                                   0.000312
3       6070   scott.loving@enron.com                                     0.000306
3       6936   hakeem.ogunbunmi@enron.com           Hakeem Ogunbunmi      0.000291
3       3006   robert.ramirez@enron.com             Robert Ramirez        0.000288


CATEGORY  4 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  6247  COMPONENTS:  8 

LARGEST  COMPONENT  SIZE:  6224  PERCENT  OF  TOTAL  GRAPH:  99.63%
GROUP  DEGREE:  0.04842  GRAPH  DENSITY:  0.00080

GROUP  CLOSENESS:  0.00206  GROUP  BETWEENNESS:  0.07966

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.01


MOST  PROBABLE  USERS 


Topic#  ID#    Email Address                        Name                  p(z|u)
4       8773   michelle.nelson@enron.com            Michelle Nelson       0.001505
4       742    amanda.rybarski@enron.com            Amanda Rybarski       0.001426
4       707    mike.maggi@enron.com                 Mike Maggi            0.001151
4       8845   sam.leuschen@enron.com               Sam Leuschen          0.000394
4       14652  amanda.huble@enron.com               Amanda Huble          0.000387
4       1153   e..kelly@enron.com                   Mike E. Kelly         0.000383
4       6933   gabriel.monroy@enron.com             Gabriel Monroy        0.000348
4       5123   margaret.allen@enron.com                                   0.000319
4       18956  cecilia.rodriguez@enron.com          Cecilia Rodriguez     0.000270
4       22344  becky.pitre@enron.com                                      0.000261
4       3643   alexandra.villarreal@enron.com       Alexandra Villarreal  0.000240
4       24363  roberts@enron.com                                          0.000223
4       21151  steve.bigalow@enron.com              Steve Bigalow         0.000209
4       38243  jr.martinez@enron.com                                      0.000206
4       12171  .sheila@enron.com                    e-mail                0.000199
4       30688  donna.dye@enron.com                                        0.000196
4       15949  alexandra.saler@enron.com                                  0.000191
4       14710  james.barker@enron.com               James Barker          0.000189
4       15234  brent.dornier@enron.com              Brent Dornier         0.000189
4       19898  corey.hollander@enron.com                                  0.000186
0.000186 


CATEGORY  5 

EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  5864  COMPONENTS:  23 

LARGEST  COMPONENT  SIZE:  5792  PERCENT  OF  TOTAL  GRAPH:  98.77%
GROUP  DEGREE:  0.06912  GRAPH  DENSITY:  0.00085

GROUP  CLOSENESS:  0.00039  GROUP  BETWEENNESS:  0.12967

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.02
MOST  PROBABLE  USERS


Topic#  ID#    Email Address                        Name                  p(z|u)
5       14993  eserver@enron.com                    eserver@enron.com@EN  0.001461
5       2985   enron.payroll@enron.com              "Enron.Payroll@enron  0.001450
5       3652   payroll.enron@enron.com              Enron Payroll         0.001429
5       1187   confirmit@enron.com                  Confirmit             0.001350
5       18319  tahnee.stall@enron.com                                     0.001239
5       18318  tammy.marcontell@enron.com                                 0.001230
5       18326  mbx_iscinfra@enron.com                                     0.001111
5       1502   ic@enron.com                         "ic@enron.com"        0.001071
5       28467  resources@enron.com                  "human resources@enr  0.001062
5       9987   jderric@enron.com                                          0.000751
5       28781  enronanywhere@enron.com              "enronanywhere@enron  0.000751
5       589    jennifer.rosado@enron.com            Jennifer Rosado       0.000595
5       38978  communications.internal@enron.com    Internal Communicati  0.000520
5       6156   j.harris@enron.com                                         0.000508
5       5539   harora@enron.com                                           0.000484
5       409    bwillia5@enron.com                   William Williams      0.000474
5       4636   expense.report@enron.com                                   0.000469
5       26756  talent@enron.com                     "talent@enron.com"    0.000465
5       16332  communityrelations@enron.com         "communityrelations@  0.000458
5       76892  team.oakland@enron.com               Team Oakland          0.000437


CATEGORY  6


EXPLICIT  SOCIAL  NETWORK  STATISTICS

VERTICES:  7767  COMPONENTS:  21

LARGEST  COMPONENT  SIZE:  7689  PERCENT  OF  TOTAL  GRAPH:  99.00%
GROUP  DEGREE:  0.07389  GRAPH  DENSITY:  0.00077

GROUP  CLOSENESS:  0.00032  GROUP  BETWEENNESS:  0.10976

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.01


MOST  PROBABLE  USERS


Topic#  ID#    Email Address                        Name                  p(z|u)
6       8780   kensey_subscriber@mailman.enron.com                        0.001523
6       20045  restricted.list@enron.com                                  0.001002
6       8594   kkeiser@enron.com                                          0.000723
6       16550  clearhead@mailman.enron.com          clearhead@mailman.en  0.000448
6       983    westdesksupport@enron.com                                  0.000311
6       37125  communications@enron.com             Communications        0.000268
6       58211  massage.therapy@enron.com                                  0.000227
6       22827  crodrig@ect.enron.com                                      0.000225
6       25054  enron.gss@enron.com                  Enron GSS             0.000181
6       60043  critical.notice@enron.com                                  0.000181
6       34474  the.daytrader@enron.com                                    0.000180
6       35531  renee.ratcliff@enron.com             Renee Ratcliff        0.000180
6       13263  robert.gerry@enron.com               Robert Gerry          0.000173
6       57767  charles.okechukwu@enron.com          Charles Okechukwu     0.000155
6       57370  sshackl@ect.enron.com                "Sara Shackelton "    0.000144
6       10231  mason.hamlin@enron.com                                     0.000143
6       5737   steven.bailey@enron.com                                    0.000142
6       62459  make.money@mailman.enron.com                               0.000138
6       15719  liz.hillman@enron.com                                      0.000136
6       7732   gerosimo@enron.com                                         0.000129

******************************************************************************************************************* 
CATEGORY  7 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  8984  COMPONENTS:  20 

LARGEST  COMPONENT  SIZE:  8926  PERCENT  OF  TOTAL  GRAPH:  99.35%
GROUP  DEGREE:  0.09017  GRAPH  DENSITY:  0.00078

GROUP  CLOSENESS:  0.00045  GROUP  BETWEENNESS:  0.12980

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.02


MOST  PROBABLE  USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
7       2883   all.houston@enron.com                                           0.001278
7       6242   all.downtown@enron.com                    All Enron Downtown    0.001040
7       2590   lauren.schlesinger@enron.com              Lauren Schlesinger    0.000904
7       5582   body.shop@enron.com                                             0.000847
7       61532  runners@enron.com                                               0.000794
7       6227   enron.action@enron.com                                          0.000703
7       12651  susan.poole@enron.com                     Susan Poole           0.000542
7       8630   donna.teal@enron.com                      Donna Teal            0.000520
7       5679   stan.horton@enron.com                                           0.000492
7       20366  enron.houston@enron.com                                         0.000490
7       53565  wanda.chalk@enron.com                                           0.000470
7       65073  public.houston@enron.com                                        0.000470
7       11117  jennifer.pattison@enron.com               Jennifer Pattison     0.000465
7       8645   dl-ga-all_enron_houston@enron.com         DL-GA-all_enron_hous  0.000446
7       43107  unspecified-recipients@enron.com          unspecified-recipien  0.000437
7       1078   40enron@enron.com                         Tracey Ramsey - Glob  0.000427
7       54190  gas.houston@enron.com                                           0.000421
7       19793  charla.stuart@enron.com                                         0.000410
7       412    no.address@enron.com                                            0.000400
7       9372   jeffrey.mcclellan@enron.com                                     0.000382


CATEGORY  8 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  8795  COMPONENTS:  19 

LARGEST  COMPONENT  SIZE:  8748  PERCENT  OF  TOTAL  GRAPH:  99.47%
GROUP  DEGREE:  0.06955  GRAPH  DENSITY:  0.00080

GROUP  CLOSENESS:  0.00060  GROUP  BETWEENNESS:  0.07979

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.02


MOST  PROBABLE  USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
8       18047  project.team@enron.com                                          0.001179
8       2625   jose.favela@enron.com                                           0.001013
8       20280  ernie@enron.com                                                 0.000905
8       95     center.dl-portland@enron.com              DL-Portland World Tr  0.000720
8       19475  grade.presas@enron.com                                          0.000640
8       211    portland.desk@enron.com                   Portland West Desk    0.000637
8       165    desk.portland@enron.com                   Portland West Desk    0.000632
8       4981   magdelena.cruz@enron.com                                        0.000619
8       1843   andrea.richards@enron.com                 Andrea Richards       0.000564
8       15291  axisteam@enron.com                        "The Associate and A  0.000536
8       24080  registrar.isc@enron.com                   ISC Registrar         0.000535
8       18046  lee.steele@enron.com                                            0.000525
8       30153  investinme@enron.com                                            0.000485
8       1827   constance.charles@enron.com               Constance Charles     0.000474
8       776    kevin.whitehurst@enron.com                Kevin Whitehurst      0.000460
8       1183   lisa.jones@enron.com                      Lisa Jones            0.000459
8       29402  project.gem@enron.com                                           0.000438
8       127    josie.jarnagin@enron.com                                        0.000429
8       17268  isc.registrar@enron.com                                         0.000426
8       5380   cheryl.kuehl@enron.com                                          0.000413


CATEGORY  9 

EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  9817  COMPONENTS:  16 

LARGEST  COMPONENT  SIZE:  9776  PERCENT  OF  TOTAL  GRAPH:  99.58%
GROUP  DEGREE:  0.08249  GRAPH  DENSITY:  0.00071

GROUP  CLOSENESS:  0.00075  GROUP  BETWEENNESS:  0.13983

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.02


MOST  PROBABLE  USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
9       5600   michael.horning@enron.com                                       0.000549
9       2368   coo.jeff@enron.com                        Jeff McMahon - Presi  0.000511
9       1603   anthony.duenner@enron.com                                       0.000493
9       1176   ethink@enron.com                          ethink                0.000489
9       34025  mitch.meyer@enron.com                                           0.000470
9       86     all.worldwide@enron.com                   All Enron Worldwide   0.000454
9       5381   matthew.scrimshaw@enron.com                                     0.000442
9       65258  nate.ellis@enron.com                                            0.000425
9       34043  mariano.gomez@enron.com                   Mariano Gomez         0.000405
9       29742  mcarson@enron.com                                               0.000403
9       43993  norm.ruiz@enron.com                                             0.000385
9       54281  new.jun-sept@enron.com                                          0.000384
9       4088   dorothy.dalton@enron.com                                        0.000378
9       6320   gail.whipple@enron.com                                          0.000376
9       19727  john.tollefsen@enron.com                                        0.000360
9       5620   curly.baca@enron.com                                            0.000346
9       44158  mike.teal@enron.com                                             0.000345
9       12527  rodney.derbigny@enron.com                 Rodney Derbigny       0.000338
9       51410  rmukherj@ees.enron.com                                          0.000335
9       1894   chairman.enron@enron.com                  Enron Office Of The   0.000332


CATEGORY  10 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  7170  COMPONENTS:  15 

LARGEST  COMPONENT  SIZE:  7107  PERCENT  OF  TOTAL  GRAPH:  99.12%
GROUP  DEGREE:  0.12595  GRAPH  DENSITY:  0.00098

GROUP  CLOSENESS:  0.00047  GROUP  BETWEENNESS:  0.18976

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.02


MOST  PROBABLE  USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
10      6683   carole.frank@enron.com                    Carole Frank          0.001058
10      6752   chance.rabon@enron.com                    Chance Rabon          0.000945
10      718    bruce.mills@enron.com                     Bruce Mills           0.000893
10      311    hal.mckinney@enron.com                    Hal McKinney          0.000845
10      694    brad.jones@enron.com                      Brad Jones            0.000845
10      360    wayne.vinson@enron.com                    Donald Wayne Vinson   0.000757
10      6579   andres.balmaceda@enron.com                Andres Balmaceda      0.000687
10      730    scott.palmer@enron.com                    B. Scott Palmer       0.000665
10      500    sherry.dawson@enron.com                   Sherry Dawson         0.000638
10      594    amanda.schultz@enron.com                  Amanda Schultz        0.000637
10      274    sanjeev.gupta@enron.com                   Sanjeev Gupta         0.000615
10      8763   tiffany.miller@enron.com                  Tiffany Miller        0.000546
10      23681  brian.kristjansen@enron.com                                     0.000521
10      20242  delma.salazar@enron.com                                         0.000511
10      649    randy.bhatia@enron.com                    Randy Bhatia          0.000509
10      754    cathy.sprowls@enron.com                   Cathy Sprowls         0.000505
10      3025   shifali.sharma@enron.com                  Shifali Sharma        0.000475
10      3019   anne.bike@enron.com                       Anne Bike             0.000467
10      452    kathy.reeves@enron.com                    Kathy Reeves          0.000408
10      14656  lee.fascetti@enron.com                                          0.000397


******************************************************************************************************************* 
CATEGORY  11 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  7614  COMPONENTS:  20 

LARGEST  COMPONENT  SIZE:  7556  PERCENT  OF  TOTAL  GRAPH:  99.24%
GROUP  DEGREE:  0.07235  GRAPH  DENSITY:  0.00092

GROUP  CLOSENESS:  0.00046  GROUP  BETWEENNESS:  0.12978

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.01


MOST  PROBABLE  USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
11      34417  alewis@ect.enron.com                      Andrew Lewis          0.000879
11      29798  investor@mailboy.enron.com                                      0.000612
11      8694   investor@mailman.enron.com                                      0.000478
11      62034  jwillia@enron.com                                               0.000469
11      29847  hot39d@mailman.enron.com                                        0.000454
11      16172  dbaughm@ect.enron.com                                           0.000446
11      20449  press.release@enron.com                                         0.000398
11      6221   larimore@enron.com                                              0.000391
11      2548   f..keavey@enron.com                       Peter F. Keavey       0.000383
11      34375  alewis@enron.com                                                0.000365
11      34732  andrew.h.lewis@enron.com                  andrew.h.lewis        0.000336
11      6166   glenn.dubin@enron.com                                           0.000320
11      21296  money.in.motion@mailman.enron.com                               0.000313
11      82051  3 appling@enron.com                                             0.000298
11      68426  bmckay@ect.enron.com                                            0.000275
11      41521  brapp@enron.com                                                 0.000257
11      19928  pallen@enron.com                                                0.000254
11      36645  sneal@ei.enron.com                                              0.000251
11      57370  sshackl@ect.enron.com                     "Sara Shackelton      0.000220
11      1999   jaime.gualy@enron.com                     Jaime Gualy           0.000219


CATEGORY  12 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  7712  COMPONENTS:  20 

LARGEST  COMPONENT  SIZE:  7656  PERCENT  OF  TOTAL  GRAPH:  99.27%
GROUP  DEGREE:  0.06178  GRAPH  DENSITY:  0.00078

GROUP  CLOSENESS:  0.00052  GROUP  BETWEENNESS:  0.07976

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.01


MOST  PROBABLE  USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
12      3053   john.wilson@enron.com                     John Wilson           0.000774
12      14663  bill.briggs@enron.com                                           0.000725
12      14664  phil.clifford@enron.com                                         0.000722
12      30596  sarah.wesner@enron.com                                          0.000432
12      512    jason.fischer@enron.com                   Jason Fischer         0.000336
12      41351  nymex.list@enron.com                                            0.000324
12      3113   sara.shackleton@enron.com                                       0.000323
12      532    reginald.hart@enron.com                   Reginald Hart         0.000316
12      41717  brenda.l.funk@enron.com                   Brenda L." "Funk      0.000311
12      288    tana.jones@enron.com                      Tana Jones            0.000293
12      6673   celeste.cisneros@enron.com                Celeste Cisneros      0.000293
12      10238  mary.ruffer@enron.com                                           0.000286
12      15914  kim.stanley@enron.com                                           0.000265
12      81667  brewer@enron.com                                                0.000249
12      1396   vicsandra.trujillo@enron.com              Vicsandra Trujillo    0.000247
12      8409   trevor.randolph@enron.com                 Trevor Randolph       0.000246
12      37136  counsel.dave@enron.com                    Assistant General Co  0.000227
12      46280  john.west@enron.com                                             0.000207
12      17096  kimberly.allen@enron.com                                        0.000199
12      8953   jason.moore@enron.com                     Jason Moore           0.000193


CATEGORY  13 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  7018  COMPONENTS:  18 

LARGEST  COMPONENT  SIZE:  6973  PERCENT  OF  TOTAL  GRAPH:  99.36%
GROUP  DEGREE:  0.11584  GRAPH  DENSITY:  0.00086

GROUP  CLOSENESS:  0.00073  GROUP  BETWEENNESS:  0.19976

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.01


MOST  PROBABLE  USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
13      3973   admin.enron@enron.com                     Enron MailSweeper Ad  0.001527
13      3780   enron.mailsweeper.admin@enron.com                               0.001500
13      4987   billy.dorsey@enron.com                                          0.001323
13      1285   suzanne.danz@enron.com                    Suzanne Danz          0.001204
13      16110  dbaughm@notes.enron.com                                         0.001154
13      31730  enron.messaging.administration@enron.com                        0.000990
13      35216  greg.gonzales@enron.com                   Greg Gonzales         0.000774
13      35359  mmaggi@notes.enron.com                                          0.000761
13      3357   adam.senn@enron.com                       Adam Senn             0.000735
13      29682  vkamins@notes.enron.com                                         0.000714
13      25042  victoria.wilbeck@enron.com                                      0.000696
13      60310  mlenhart@notes.enron.com                                        0.000665
13      35004  plove@notes.enron.com                                           0.000646
13      81364  rsander@notes.enron.com                                         0.000563
13      24890  melody.gray@enron.com                                           0.000551
13      85760  mhain@ect.enron.com                       "                     0.000534
13      40561  swhite@notes.enron.com                                          0.000482
13      3781   crandall@notes.enron.com                                        0.000480
13      4116   katherine.brown@enron.com                                       0.000475
13      46388  ed.cattigan@enron.com                     Ed Cattigan           0.000448


CATEGORY  14 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  7882  COMPONENTS:  20 

LARGEST  COMPONENT  SIZE:  7831  PERCENT  OF  TOTAL  GRAPH:  99.35%
GROUP  DEGREE:  0.05171  GRAPH  DENSITY:  0.00076

GROUP  CLOSENESS:  0.00055  GROUP  BETWEENNESS:  0.06977

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.01


MOST  PROBABLE  USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
14      60189  pkeavey@ect.enron.com                                           0.000846
14      81466  mheard@ect.enron.com                                            0.000810
14      24256  cgerman@ect.enron.com                                           0.000788
14      77175  esager2@ect.enron.com                                           0.000702
14      15669  sscott5@enron.com                                               0.000682
14      60973  kruscit@ect.enron.com                     Kevin                 0.000611
14      37552  pplatte@ect.enron.com                                           0.000604
14      71045  mcuilla@ect.enron.com                     Martin Cuilla         0.000545
14      71324  tkuyken@ect.enron.com                                           0.000482
14      83277  mtaylol@ect.enron.com                                           0.000458
14      25054  enron.gss@enron.com                       Enron GSS             0.000434
14      73030  jshankm@ect.enron.com                     Wharton Alumni        0.000434
14      78508  kholst@enron.com                                                0.000421
14      31824  joseph.lippeatt@enron.com                 Joseph Lippeatt       0.000416
14      78538  kholst@ect.enron.com                                            0.000378
14      24019  backroads.travel.update@mailman.enron.co                        0.000367
14      41627  scorman@enron.com                                               0.000367
14      71848  ddavis@ect.enron.com                                            0.000365
14      4310   ebass@enron.com                                                 0.000355
14      79405  gblair@enron.com                                                0.000330


******************************************************************************************************************* 
CATEGORY  15 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  5448  COMPONENTS:  18 

LARGEST  COMPONENT  SIZE:  5411  PERCENT  OF  TOTAL  GRAPH:  99.32%
GROUP  DEGREE:  0.04648  GRAPH  DENSITY:  0.00092

GROUP  CLOSENESS:  0.00364  GROUP  BETWEENNESS:  0.37696

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.01


MOST  PROBABLE  USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
15      3430   system.administrator@enron.com                                  0.001450
15      1181   exchange.administrator@enron.com                                0.001211
15      29154  postmaster@enron.com                                            0.001062
15      23209  ect.admin@enron.com                                             0.000369
15      68893  idrc.houston.chapter@mailman.enron.com                          0.000261
15      24888  david.glessner@enron.com                                        0.000258
15      30984  lopez@enron.com                                                 0.000247
15      21066  rumaldo.lopez@enron.com                   Rumaldo Lopez         0.000210
15      35804  everyone_in_ect_calgary@enron.com                               0.000201
15      20489  crandal.hardy@enron.com                                         0.000179
15      38713  sumey@enron.com                                                 0.000174
15      37125  communications@enron.com                  Communications        0.000172
15      15319  sharon.peace@enron.com                                          0.000170
15      6075   debra.young@enron.com                                           0.000167
15      8409   trevor.randolph@enron.com                 Trevor Randolph       0.000162
15      3000   tammy.gilmore@enron.com                   Tammy Gilmore         0.000161
15      7658   jennifer.oliver@enron.com                 Jennifer Oliver       0.000159
15      39836  ibuyit.approvers@enron.com                                      0.000153
15      19970  rob.bakondy@enron.com                                           0.000148
15      14908  rajesh.chettiar@enron.com                                       0.000142


CATEGORY  16 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  7500  COMPONENTS:  25 

LARGEST  COMPONENT  SIZE:  7429  PERCENT  OF  TOTAL  GRAPH:  99.05%
GROUP  DEGREE:  0.10801  GRAPH  DENSITY:  0.00093

GROUP  CLOSENESS:  0.00036  GROUP  BETWEENNESS:  0.16978

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.01


MOST  PROBABLE  USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
16      1105   j..broderick@enron.com                    Paul J. Broderick     0.001183
16      459    garrett.tripp@enron.com                   Garrett Tripp         0.000728
16      23333  darrell.schoolcraft@enron.com                                   0.000672
16      15399  tmartin@enron.com                                               0.000595
16      71298  rgay@enron.com                                                  0.000575
16      34375  alewis@enron.com                                                0.000391
16      56955  alexandre.bueno@enron.com                                       0.000352
16      72491  jtholt@enron.com                                                0.000348
16      40227  maurice.gilbert@enron.com                 Maurice Gilbert       0.000311
16      11170  gabriel.chavez@enron.com                                        0.000293
16      23675  roger.westfall@enron.com                                        0.000278
16      11392  john.millar@enron.com                                           0.000270
16      52383  jeff_dasovich@ees.enron.com                                     0.000258
16      11391  james.bryja@enron.com                                           0.000249
16      42065  bill.mangels@enron.com                                          0.000249
16      45145  jsteffe@enron.com                                               0.000244
16      15648  farzad.farhangnia@enron.com                                     0.000242
16      15007  security.sap@enron.com                    SAP Security          0.000231
16      20345  gas.operations@enron.com                                        0.000226
16      11176  milagros.daetz@enron.com                                        0.000210


CATEGORY  17 

EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  8409  COMPONENTS:  16 

LARGEST  COMPONENT  SIZE:  8374  PERCENT  OF  TOTAL  GRAPH:  99.58%
GROUP  DEGREE:  0.10491  GRAPH  DENSITY:  0.00083

GROUP  CLOSENESS:  0.00100  GROUP  BETWEENNESS:  0.17981

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.02


MOST  PROBABLE  USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
17      3015   ted.evans@enron.com                       Ted Evans             0.001210
17      11297  jason.jennaro@enron.com                                         0.000787
17      18956  cecilia.rodriguez@enron.com               Cecilia Rodriguez     0.000775
17      11322  elizabeth.peters@enron.com                                      0.000729
17      7076   zachary.mccarroll@enron.com               Zachary McCarroll     0.000569
17      5645   wilson.kriegel@enron.com                                        0.000535
17      6933   gabriel.monroy@enron.com                  Gabriel Monroy        0.000530
17      29138  douglas.nichols@enron.com                                       0.000527
17      14878  li.sun@enron.com                                                0.000508
17      3617   chris.cramer@enron.com                    Chris Cramer          0.000505
17      36661  donnis.traylor@enron.com                                        0.000505
17      2197   ted.noble@enron.com                       Ted Noble             0.000486
17      34254  donald.miller@enron.com                                         0.000484
17      2592   m..scott@enron.com                        Susan M. Scott        0.000481
17      80350  3 nielsen@enron.com                                             0.000481
17      35941  jmckay2@ect.enron.com                     Jon McKay             0.000479
17      14744  zarin.imam@enron.com                      Zarin Imam            0.000476
17      3643   alexandra.villarreal@enron.com            Alexandra Villarreal  0.000469
17      9446   susan.weison@enron.com                                          0.000469
17      26591  ora.cross@enron.com                       Ora Cross             0.000469


CATEGORY  18


EXPLICIT  SOCIAL  NETWORK  STATISTICS

VERTICES:  7325  COMPONENTS:  18

LARGEST  COMPONENT  SIZE:  7282  PERCENT  OF  TOTAL  GRAPH:  99.41%
GROUP  DEGREE:  0.06819  GRAPH  DENSITY:  0.00082

GROUP  CLOSENESS:  0.00074  GROUP  BETWEENNESS:  0.12972

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.03


MOST  PROBABLE  USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
18      63604  lhs-gas.kvammen@enron.com                 Kjell - LHS-GAS Kvam  0.001045
18      63612  lhc-gas.kvammen@enron.com                 Kjell - LHC-GAS Kvam  0.001022
18      63315  gas.lhc@enron.com                         LHC GAS               0.000690
18      63316  hfs.reite@enron.com                       NILS - B. Superinten  0.000676
18      3000   tammy.gilmore@enron.com                   Tammy Gilmore         0.000587
18      76971  dl-etsgascontrollers@enron.com            DL-ETS Gas Controlle  0.000583
18      44089  ranelle.paladino@enron.com                                      0.000556
18      53773  controllers.dl-ets@enron.com              DL-ETS Gas Controlle  0.000555
18      19807  alma.carrillo@enron.com                                         0.000513
18      7005   jane.joyce@enron.com                      Jane Joyce            0.000437
18      53722  angela.white@enron.com                    Angela White          0.000430
18      54051  jim.fernie@enron.com                      Jim Fernie            0.000377
18      54135  bullets@enron.com                                               0.000367
18      44209  jan.moore@enron.com                                             0.000366
18      76960  pipeline.team@enron.com                   Team Pampa Pipeline,  0.000357
18      15299  theresa.branney@enron.com                                       0.000336
18      24804  kelly.allen@enron.com                                           0.000330
18      19826  kim.perez@enron.com                                             0.000325
18      53749  v.dickerson@enron.com                     Steve V Dickerson     0.000325
18      24885  ava.garcia@enron.com                                            0.000318

******************************************************************************************************************* 
CATEGORY  19 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  7159  COMPONENTS:  8 

LARGEST  COMPONENT  SIZE:  7145  PERCENT  OF  TOTAL  GRAPH:  99.80%
GROUP  DEGREE:  0.05925  GRAPH  DENSITY:  0.00084

GROUP  CLOSENESS:  0.00460  GROUP  BETWEENNESS:  0.10972

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.01


MOST  PROBABLE  USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
19      7667   ipayit@enron.com                          iPayit@Enron.com>@EN  0.001524
19      11479  ibuyit.payables@enron.com                 iBuyit.Payables@Enro  0.001384
19      3602   robert.jones@mailman.enron.com                                  0.001375
19      6520   payables.ibuyit@enron.com                 iBuyit Payables       0.001243
19      73730  mariachi.el@enron.com                     El Mariachi           0.001086
19      21165  carolyn.graham@enron.com                  Carolyn Graham        0.000801
19      2629   bbutler2@enron.com                                              0.000701
19      1115   clint.dean@enron.com                      Clint Dean            0.000438
19      8342   ibuyit@enron.com                                                0.000382
19      17268  isc.registrar@enron.com                                         0.000242
19      29191  vanessa.griffin@enron.com                 Vanessa Griffin       0.000175
19      30857  erequest@enron.com                                              0.000168
19      39836  ibuyit.approvers@enron.com                                      0.000167
19      15060  ashu.tewari@enron.com                     Ashu Tewari           0.000163
19      14905  bradley.stewart@enron.com                                       0.000147
19      79464  quickplace@nahou-lnww01.ots.enron.com     "customerservice"     0.000145
19      15556  sap.coe@enron.com                                               0.000142
19      822    portland.dl-ubsw@enron.com                DL-UBSW Energy Portl  0.000132
19      17913  deborah.heath@enron.com                   Deborah Heath         0.000127
19      27028  john.chambers@enron.com                                         0.000127


CATEGORY  20 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  7086  COMPONENTS:  11 

LARGEST  COMPONENT  SIZE:  7065  PERCENT  OF  TOTAL  GRAPH:  99.70%
GROUP  DEGREE:  0.12786  GRAPH  DENSITY:  0.00085

GROUP  CLOSENESS:  0.00285  GROUP  BETWEENNESS:  0.23978

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.01


MOST  PROBABLE  USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
20      4320   erwin.landivar@enron.com                                        0.000409
20      20271  north.america@enron.com                                         0.000383
20      36085  north.america_europe@enron.com                                  0.000358
20      19300  coralie.evans@enron.com                                         0.000332
20      19088  neil.tarling@enron.com                    Neil Tarling          0.000323
20      227    sally.beck@enron.com                      Sally Beck            0.000314
20      7163   holly.heath@enron.com                     Holly Heath           0.000295
20      1844   corina.taylor@enron.com                   Corina Taylor         0.000277
20      63664  enw.all@enron.com                                               0.000271
20      15536  jay.smith@enron.com                       Jay Smith             0.000260
20      18400  bill.gulyassy@enron.com                                         0.000252
20      19258  egm.employees@enron.com                                         0.000246
20      20456  peter.ghavami@enron.com                                         0.000242
20      7005   jane.joyce@enron.com                      Jane Joyce            0.000239
20      14726  george.hope@enron.com                     George Hope           0.000227
20      56873  elizabeth.serralheiro@enron.com                                 0.000226
20      23554  enw.piper@enron.com                                             0.000208
20      6879   geynille.dillingham@enron.com             Geynille Dillingham   0.000204
20      18607  sally_beck@enron.com                                            0.000198
20      19116  sbeck2@enron.com                                                0.000182


CATEGORY  21 

EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  6667  COMPONENTS:  13 

LARGEST  COMPONENT  SIZE:  6617  PERCENT  OF  TOTAL  GRAPH:  99.25%
GROUP  DEGREE:  0.08622  GRAPH  DENSITY:  0.00090

GROUP  CLOSENESS:  0.00058  GROUP  BETWEENNESS:  0.12973

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.01


MOST  PROBABLE  USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
21      1145   benjamin.rogers@enron.com                 Benjamin Rogers       0.000686
21      84535  rring@ees.enron.com                       NYISO TIE List        0.000629
21      54044  5 ball@enron.com                                                0.000520
21      6324   mauboussin@enron.com                                            0.000345
21      30924  maria.garcia@enron.com                                          0.000259
21      76603  benjamin.rogers@ect.enron.com             "                     0.000253
21      4182   brogers2@enron.com                                              0.000244
21      34474  the.daytrader@enron.com                                         0.000204
21      607    d..thomas@enron.com                       Paul D. Thomas        0.000202
21      21345  subscriber@mailboy.enron.com                                    0.000181
21      69527  stock.option.grant.list@enron.com                               0.000179
21      41601  rshapiro@enron.com                                              0.000175
21      85     ebiz@enron.com                            eBiz                  0.000170
21      6859   j..vitrella@enron.com                     David J. Vitrella     0.000153
21      15719  liz.hillman@enron.com                                           0.000144
21      75988  brogers2@ect.enron.com                                          0.000141
21      37501  julio.guzman@enron.com                    Julio Guzman          0.000136
21      37149  union.credit@enron.com                    Credit Union          0.000126
21      19371  danny.wilson@enron.com                                          0.000124
21      2980   mcurry@enron.com                                                0.000120


CATEGORY  22 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  4952  COMPONENTS:  12 

LARGEST  COMPONENT  SIZE:  4926  PERCENT  OF  TOTAL  GRAPH:  99.47%
GROUP  DEGREE:  0.07514  GRAPH  DENSITY:  0.00121

GROUP  CLOSENESS:  0.00197  GROUP  BETWEENNESS:  0.12954

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.03


MOST  PROBABLE  USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
22      19980  harry.collins@enron.com                                         0.001380
22      22502  gary.a.hanks@enron.com                    gary.a.hanks          0.001323
22      18664  brian.lindsay@enron.com                   Brian Lindsay         0.001213
22      15567  earl.tisdale@enron.com                                          0.001114
22      37910  amy.heffernan@enron.com                                         0.001111
22      2995   juana.fayett@enron.com                    Juana Fayett          0.001107
22      250    cynthia.clark@enron.com                   Cynthia Clark         0.001098
22      18180  sonya.clarke@enron.com                                          0.001034
22      6265   jana.morse@enron.com                      Jana Morse            0.000976
22      17783  paul.maley@enron.com                      Paul Maley            0.000941
22      2186   robbi.rossi@enron.com                     Robbi Rossi           0.000880
22      2993   trang.le@enron.com                        Trang Le              0.000857
22      19791  nicole.hunter@enron.com                                         0.000844
22      19629  albert.escamilla@enron.com                                      0.000810
22      8876   tandra.coleman@enron.com                  Tandra Coleman        0.000769
22      18181  tim.davies@enron.com                                            0.000745
22      22068  enron.counterparty@enron.com                                    0.000702
22      29291  .cooper@enron.com                         ebs                   0.000643
22      1094   karen.o'day@enron.com                     Karen day             0.000641
22      1202   center.eol@enron.com                                            0.000625


*******************************************************************************************************************
CATEGORY  23


EXPLICIT  SOCIAL  NETWORK  STATISTICS

VERTICES:  8278  COMPONENTS:  16

LARGEST  COMPONENT  SIZE:  8212  PERCENT  OF  TOTAL  GRAPH:  99.20%
GROUP  DEGREE:  0.08240  GRAPH  DENSITY:  0.00085

GROUP  CLOSENESS:  0.00039  GROUP  BETWEENNESS:  0.14980

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.02


MOST  PROBABLE  USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
23      20676  transportation.parking@enron.com          Parking & Transporta  0.000669
23      15274  parking.transportation@enron.com          Parking & Transporta  0.000557
23      2363   morris.larubbio@enron.com                 Morris Larubbio       0.000539
23      20547  lance.jameson@enron.com                                         0.000530
23      772    laura.vuittonet@enron.com                 Laura Vuittonet       0.000507
23      2501   kimberly.bates@enron.com                  Kimberly Bates        0.000497
23      8764   jennifer.mendez@enron.com                 Jennifer Mendez       0.000486
23      23522  john.o'conner@enron.com                   John Conner           0.000480
23      650    jae.black@enron.com                       Tamara Jae Black      0.000467
23      64095  devries@enron.com                                               0.000440
23      11174  lindsay.culotta@enron.com                                       0.000438
23      779    becky.young@enron.com                     Becky Young           0.000427
23      7028   joseph.nieten@enron.com                   Joseph Nieten         0.000426
23      8357   airam.arteaga@enron.com                                         0.000424
23      655    karen.buckley@enron.com                   Karen Buckley         0.000417
23      353    mark.symms@enron.com                      Mark Symms            0.000413
23      64088  mcmichael@enron.com                                             0.000393
23      15546  laura.harder@enron.com                                          0.000376
23      2188   s..gartner@enron.com                      Julie S. Gartner      0.000358
23      6097   simone.rose@enron.com                                           0.000352


CATEGORY  24 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  7781  COMPONENTS:  19 

LARGEST  COMPONENT  SIZE:  7727  PERCENT  OF  TOTAL  GRAPH:  99.31%
GROUP  DEGREE:  0.11346  GRAPH  DENSITY:  0.00077

GROUP  CLOSENESS:  0.00054  GROUP  BETWEENNESS:  0.20979

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.01


MOST  PROBABLE  USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
24      7009   jay.knoblauh@enron.com                    Jay Knoblauh          0.000818
24      15206  jeff.stephens@enron.com                   Jeff Stephens         0.000746
24      1599   john.disturnal@enron.com                                        0.000741
24      2505   mara.bronstein@enron.com                  Mara Bronstein        0.000708
24      8829   jeffery.stephens@enron.com                Jeffery Stephens      0.000613
24      52900  'williams@enron.com                                             0.000534
24      14777  kay.young@enron.com                                             0.000531
24      3520   ragan.bond@enron.com                      Ragan Bond            0.000524
24      35820  greg.frers@enron.com                      Greg Frers            0.000454
24      6943   gregory.schockling@enron.com              Gregory Schockling    0.000421
24      465    dipak.agarwalla@enron.com                 Dipak Agarwalla       0.000389
24      72408  'fildes@enron.com                                               0.000347
24      60731  turner@enron.com                                                0.000344
24      48843  roger_yang@enron.com                                            0.000313
24      81783  woodruff@enron.com                                              0.000302
24      2548   f..keavey@enron.com                       Peter F. Keavey       0.000287
24      81795  'bevans@enron.com                                               0.000278
24      773    .ward@enron.com                           houston               0.000273
24      16324  tracey.kari@enron.com                                           0.000272
24      71735  1 hadix@enron.com                                               0.000272


CATEGORY  25 

EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  8513  COMPONENTS:  19 

LARGEST  COMPONENT  SIZE:  8459  PERCENT  OF  TOTAL  GRAPH:  99.37%
GROUP  DEGREE:  0.04440  GRAPH  DENSITY:  0.00094

GROUP  CLOSENESS:  0.00052  GROUP  BETWEENNESS:  0.06979

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.02


MOST  PROBABLE  USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
25      1895   dl-ga-all_enron_worldwide2@enron.com      DL-GA-all_enron_worl  0.000964
25      1889   dl-ga-all_enron_worldwide1@enron.com      DL-GA-all_enron_worl  0.000859
25      6180   enron.expertfinder@enron.com                                    0.000859
25      1914   chairman.ken@enron.com                    Ken Lay - Office of   0.000818
25      1894   chairman.enron@enron.com                  Enron Office Of The   0.000807
25      17603  dl-ga-all_enron_worldwide5@enron.com      DL-GA-ALL_enron_worl  0.000785
25      2377   dl-ga-all_enron_worldwide@enron.com       DL-GA-all_enron_worl  0.000732
25      1497   legalonline-compliance@enron.com          Office of the Chairm  0.000714
25      6226   enron.announcement@enron.com                                    0.000693
25      87     enron.chairman@enron.com                  Enron Americas - Off  0.000674
25      54525  dl-ga-all_egs@enron.com                   DL-GA-all_egs         0.000659
25      4762   office.chairman@enron.com                                       0.000656
25      17421  dl-ga-all_enron_worldwide4@enron.com      DL-GA-all_enron_worl  0.000640
25      13404  deane.pierce@enron.com                    Deane Pierce          0.000634
25      5688   ken.skilling@enron.com                                          0.000618
25      15799  all.employees@enron.com                                         0.000618
25      88     ena.employees@enron.com                   ENA Employees         0.000581
25      17604  dl-ga-all_enron_worldwide6@enron.com      DL-GA-ALL_enron_worl  0.000579
25      55972  lprior@enron.com                          "                     0.000578
25      2368   coo.jeff@enron.com                        Jeff McMahon - Presi  0.000577


CATEGORY  26 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 4645  COMPONENTS: 16
LARGEST COMPONENT SIZE: 4609  PERCENT OF TOTAL GRAPH: 99.22%
GROUP DEGREE: 0.12181  GRAPH DENSITY: 0.00086
GROUP CLOSENESS: 0.00099  GROUP BETWEENNESS: 0.19950
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.01


MOST  PROBABLE  USERS 


Topic#  ID#  Email Address  Name  p(z|u)
26  6031  outlook.team@enron.com  .  0.001577
26  215  stacey.white@enron.com  .  0.001282
26  7112  zionette.vincent@enron.com  Zionette Vincent  0.000583
26  34175  kathryn.thomas@enron.com  Kathryn Thomas  0.000353
26  5377  sean.long@enron.com  .  0.000286
26  6083  mike.thomas@enron.com  .  0.000262
26  39763  valeria.a.hope@enron.com  .  0.000231
26  39764  roxann.salina.enronxgate@enron.com  .  0.000229
26  21095  renee.pena@enron.com  Renee Pena  0.000188
26  21091  jerry.harkreader@enron.com  Jerry Harkreader  0.000180
26  20005  cuthbert.roberts@enron.com  .  0.000176
26  17830  cheryl.oliver@enron.com  .  0.000172
26  6077  john.reese@enron.com  .  0.000165
26  39765  vance.bates@enron.com  .  0.000157
26  20013  david.terlip@enron.com  .  0.000154
26  19903  cooper@enron.com  .  0.000151
26  30858  enron.customers@enron.com  .  0.000144
26  11357  ching.lun@enron.com  .  0.000139
26  1782  joe.zhou@enron.com  .  0.000136
26  1116  todd.decook@enron.com  Todd Decook  0.000135


****************************************************************************************************************** 
CATEGORY  27 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 6516  COMPONENTS: 15
LARGEST COMPONENT SIZE: 6478  PERCENT OF TOTAL GRAPH: 99.42%
GROUP DEGREE: 0.13563  GRAPH DENSITY: 0.00077
GROUP CLOSENESS: 0.00102  GROUP BETWEENNESS: 0.25973
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.01


MOST  PROBABLE  USERS 


Topic#  ID#  Email Address  Name  p(z|u)
27  14855  luis.mena@enron.com  .  0.000894
27  703  matthew.lenhart@enron.com  Matthew Lenhart  0.000597
27  642  eric.bass@enron.com  Eric Bass  0.000520
27  1878  bryan.hull@enron.com  Bryan Hull  0.000459
27  62798  matt.sample@enron.com  Matt Sample  0.000442
27  1383  nick.hiemstra@enron.com  Nick Hiemstra  0.000415
27  16849  dbaughm@enron.com  .  0.000414
27  8917  timothy.blanchard@enron.com  .  0.000400
27  14834  micah.hatten@enron.com  .  0.000391
27  62801  allan.elliott@enron.com  Allan Elliott  0.000378
27  62784  michael.cherry@enron.com  Michael Cherry  0.000377
27  1881  greg.martin@enron.com  Greg Martin  0.000360
27  2614  christa.winfrey@enron.com  Christa Winfrey  0.000321
27  8771  jackson.logan@enron.com  Jackson Logan III  0.000317
27  66671  moscoso@enron.com  .  0.000278
27  37969  molnar.mark@enron.com  .  0.000274
27  15272  phillip.love@enron.com  Phillip Love  0.000266
27  8776  thomas.underwood@enron.com  Thomas Underwood  0.000260
27  37170  'allison'@enron.com  Allison  0.000250
27  19276  nicholas.stephan@enron.com  .  0.000247


CATEGORY  28 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 7415  COMPONENTS: 16
LARGEST COMPONENT SIZE: 7374  PERCENT OF TOTAL GRAPH: 99.45%
GROUP DEGREE: 0.10942  GRAPH DENSITY: 0.00094
GROUP CLOSENESS: 0.00078  GROUP BETWEENNESS: 0.16977
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.02


MOST  PROBABLE  USERS 


Topic#  ID#  Email Address  Name  p(z|u)
28  6007  houston.report@enron.com  .  0.001461
28  38660  esaibi@enron.com  .  0.001421
28  797  sap_security@enron.com  .  0.001377
28  1017  settlements.ees@enron.com  EES Power Settlement  0.001345
28  83556  subscribers@mailman.enron.com  .  0.001331
28  83555  weatherwarn@mailman.enron.com  .  0.001307
28  34952  isc.groups@enron.com  .  0.000883
28  6998  jeffrey.jackson@enron.com  Jeffrey Jackson  0.000831
28  2118  notification.isc@enron.com  ISC Systems Notifica  0.000638
28  19782  enron.users@enron.com  .  0.000636
28  785  jarod.jenson@enron.com  Jarod Jenson  0.000538
28  30830  phillip.randle@enron.com  Phillip Randle  0.000511
28  6861  david.wile@enron.com  David Wile  0.000510
28  811  administration.enron@enron.com  Enron Messaging Admi  0.000465
28  20965  sue.rich@enron.com  .  0.000459
28  641  michael.barber@enron.com  Michael Barber  0.000454
28  24665  stephen.harrington@enron.com  .  0.000443
28  54075  hotline.isc@enron.com  ISC Hotline  0.000441
28  1315  daniel.lisk@enron.com  Daniel Lisk  0.000438
28  6891  gail.kettenbrink@enron.com  Gail Kettenbrink  0.000434


CATEGORY  29 

EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 8140  COMPONENTS: 13
LARGEST COMPONENT SIZE: 8100  PERCENT OF TOTAL GRAPH: 99.51%
GROUP DEGREE: 0.09964  GRAPH DENSITY: 0.00086
GROUP CLOSENESS: 0.00079  GROUP BETWEENNESS: 0.14979
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.01


MOST  PROBABLE  USERS


Topic#  ID#  Email Address  Name  p(z|u)
29  22613  tdonoho@enron.com  .  0.001113
29  30920  kwatson@enron.com  .  0.000933
29  35461  tmartin@ect.enron.com  .  0.000874
29  39519  jquenet@enron.com  Quenet  0.000835
29  82058  mwhitt@ect.enron.com  .  0.000827
29  77620  fermis@ect.enron.com  .  0.000745
29  59958  tdonoho@ect.enron.com  .  0.000729
29  58588  "undisclosed-recipient"@enron.com  "Undisclosed-Recipie  0.000652
29  11411  phyllis.anzalone@enron.com  .  0.000646
29  17265  fsturm@enron.com  fsturm  0.000632
29  44859  mwhitt@enron.com  .  0.000609
29  24866  lindy.donoho@enron.com  .  0.000561
29  4556  mlenhart@enron.com  .  0.000535
29  2258  rita.hartfield@enron.com  Rita Hartfield  0.000467
29  35562  esource@enron.com  eSource  0.000373
29  79405  gblair@enron.com  .  0.000363
29  2453  undisclosed-recipients@enron.com  undisclosed-recipien  0.000332
29  29712  vkamins@ect.enron.com  .  0.000331
29  17341  rbuy@enron.com  .  0.000328
29  41633  fking@enron.com  .  0.000315


CATEGORY  30


EXPLICIT  SOCIAL  NETWORK  STATISTICS

VERTICES: 7437  COMPONENTS: 18
LARGEST COMPONENT SIZE: 7383  PERCENT OF TOTAL GRAPH: 99.27%
GROUP DEGREE: 0.10751  GRAPH DENSITY: 0.00081
GROUP CLOSENESS: 0.00053  GROUP BETWEENNESS: 0.15977
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.02


MOST  PROBABLE  USERS


Topic#  ID#  Email Address  Name  p(z|u)
30  31503  grant_masson@ei.enron.com  .  0.000438
30  2082  kenneth.deng@enron.com  Kenneth Deng  0.000435
30  19763  mary.bailey@enron.com  .  0.000361
30  30940  vince_j_kaminski@enron.com  "Vince J Kaminski"  0.000353
30  32865  network.security@enron.com  .  0.000352
30  10959  althea.gordon@enron.com  .  0.000346
30  2078  jason.sokolov@enron.com  Jason Sokolov  0.000341
30  31619  lenos.trigeorgis@enron.com  .  0.000330
30  9251  rehman.sharif@enron.com  .  0.000327
30  19479  nedre.strambler@enron.com  .  0.000326
30  32307  andrew.s.fastow@enron.com  Andrew S. Fastow  0.000320
30  30722  benjamin.parsons@enron.com  .  0.000316
30  8822  kenneth.parkhill@enron.com  Kenneth Parkhill  0.000305
30  8909  a..cote@enron.com  John A. Cote  0.000299
30  30143  vince.j.kaminski@enron.com  Vince J" "Kaminski  0.000289
30  18538  stephanie.taylor@enron.com  Stephanie Taylor  0.000283
30  53111  keffer.lesley@enron.com  Lesley Keffer  0.000278
30  29453  vkamins@enron.com  .  0.000274
30  29971  kamins@enron.com  kamins@enron.com  0.000267
30  20956  maggie.li@enron.com  .  0.000260

******************************************************************************************************************* 
CATEGORY  31 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 6597  COMPONENTS: 19
LARGEST COMPONENT SIZE: 6546  PERCENT OF TOTAL GRAPH: 99.23%
GROUP DEGREE: 0.12270  GRAPH DENSITY: 0.00076
GROUP CLOSENESS: 0.00062  GROUP BETWEENNESS: 0.21973
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.01

MOST  PROBABLE  USERS 


Topic#  ID#  Email Address  Name  p(z|u)
31  2005  thomas.lowell@enron.com  Thomas Lowell  0.000976
31  77707  plucci@enron.com  Paul Lucci  0.000781
31  323  seung-taek.oh@enron.com  Seung-Taek Oh  0.000579
31  620  ryan.williams@enron.com  Ryan Williams  0.000393
31  494  kevin.cline@enron.com  Kevin Cline  0.000376
31  15540  natalie.wells@enron.com  Natalie Wells  0.000212
31  3605  t..lucci@enron.com  Paul T. Lucci  0.000175
31  11422  jim.braniff@enron.com  .  0.000175
31  15042  danny.lee@enron.com  Danny Lee  0.000173
31  44231  sap.hotline@enron.com  .  0.000157
31  5807  general.announcement@enron.com  .  0.000148
31  55961  fermis@enron.com  "  0.000148
31  32548  john.kiani@enron.com  .  0.000144
31  9360  andrew.zabriskie@enron.com  .  0.000136
31  77887  pourchot@enron.com  .  0.000136
31  37504  sidrac.flores@enron.com  Sidrac Flores  0.000131
31  11181  sarah.driscoll@enron.com  .  0.000130
31  12468  richard.orellana@enron.com  Richard Orellana  0.000124
31  64983  jackson.vo@enron.com  .  0.000124
31  33734  sarah.zarkowsky@enron.com  Sarah Zarkowsky  0.000123


CATEGORY  32 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 7636  COMPONENTS: 14
LARGEST COMPONENT SIZE: 7596  PERCENT OF TOTAL GRAPH: 99.48%
GROUP DEGREE: 0.11694  GRAPH DENSITY: 0.00079
GROUP CLOSENESS: 0.00089  GROUP BETWEENNESS: 0.20977
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.01


MOST  PROBABLE  USERS 

Topic#  ID#  Email Address  Name  p(z|u)
32  8722  home.owner@mailman.enron.com  .  0.001089
32  8629  undisclosed.recipients@mailman.enron.com  .  0.000943
32  22643  pmims@enron.com  .  0.000837
32  81740  plucci@ect.enron.com  .  0.000801
32  33798  us.home.owner@mailman.enron.com  .  0.000708
32  8779  valued.client@mailman.enron.com  .  0.000704
32  2347  h..lewis@enron.com  Andrew H. Lewis  0.000661
32  22403  dfarmer@ect.enron.com  .  0.000636
32  62384  mwoodson@enron.com  .  0.000530
32  70877  list.subscriber@mailman.enron.com  .  0.000530
32  6580  angela.barnett@enron.com  Angela Barnett  0.000516
32  79414  lblair@enron.com  .  0.000512
32  29515  postmaster@mailboy.enron.com  .  0.000496
32  70836  schlenker@mailman.enron.com  .  0.000487
32  29641  aol.users@mailman.enron.com  .  0.000479
32  8759  valued.home.owner@mailman.enron.com  .  0.000474
32  8799  event@mailman.enron.com  .  0.000466
32  71892  extramoney@mailman.enron.com  .  0.000466
32  77844  jreitme@enron.com  .  0.000461
32  82621  valued.customer@mailman.enron.com  .  0.000457


CATEGORY  33 

EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 7619  COMPONENTS: 18
LARGEST COMPONENT SIZE: 7568  PERCENT OF TOTAL GRAPH: 99.33%
GROUP DEGREE: 0.10614  GRAPH DENSITY: 0.00079
GROUP CLOSENESS: 0.00054  GROUP BETWEENNESS: 0.15977
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.02


MOST  PROBABLE  USERS


Topic#  ID#  Email Address  Name  p(z|u)
33  37111  .stephens@enron.com  bridgeline  0.001161
33  37443  'fenner@enron.com  .  0.000901
33  3580  tiffany.smith@enron.com  Tiffany Smith  0.000875
33  3354  tom.ward@enron.com  TOM WARD  0.000854
33  4581  mday@enron.com  .  0.000836
33  52959  'thompson@enron.com  .  0.000798
33  62800  beth.cherry@enron.com  Beth Cherry  0.000772
33  31826  '"sbigalow"@enron.com  "sbigalow"  0.000748
33  24509  'proctor@enron.com  .  0.000744
33  68290  a.fishkin@enron.com  Charles A Fishkin  0.000738
33  85652  vo.hoang@enron.com  Hoang Vo  0.000725
33  81781  'jernigan@enron.com  .  0.000723
33  38921  jennifer.ballas@enron.com  Jennifer Ballas  0.000720
33  37222  k.c@enron.com  .  0.000693
33  20246  valerie.curtis@enron.com  .  0.000682
33  52960  'lipper@enron.com  .  0.000669
33  78906  'ketcherside@enron.com  .  0.000652
33  46483  'ward@enron.com  .  0.000632
33  87152  'rector@enron.com  .  0.000624
33  53983  'thomas@enron.com  .  0.000613


CATEGORY  34


EXPLICIT  SOCIAL  NETWORK  STATISTICS

VERTICES: 7887  COMPONENTS: 22
LARGEST COMPONENT SIZE: 7794  PERCENT OF TOTAL GRAPH: 98.82%
GROUP DEGREE: 0.06480  GRAPH DENSITY: 0.00089
GROUP CLOSENESS: 0.00026  GROUP BETWEENNESS: 0.08978
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.02


MOST  PROBABLE  USERS


Topic#  ID#  Email Address  Name  p(z|u)
34  7514  craig.rickard@enron.com  Craig Rickard  0.001114
34  18854  teresa.aguilera-peon@enron.com  Maria Teresa Aguiler  0.001076
34  7528  t..robinson@enron.com  Richard T. Robinson  0.000975
34  14946  cecil.stapley@enron.com  .  0.000967
34  3079  r..conner@enron.com  Andrew R. Conner  0.000897
34  18901  dirk.dimitry@enron.com  Dirk Dimitry  0.000877
34  26919  janet.bowers@enron.com  Janet Bowers  0.000875
34  14574  mike.barry@enron.com  Mike Barry  0.000746
34  36470  larry.swett@enron.com  .  0.000745
34  44124  lisa.valley@enron.com  .  0.000703
34  15268  eileen.peebles@enron.com  .  0.000683
34  17270  tim.johanson@enron.com  .  0.000682
34  18877  greg.bruch@enron.com  Greg Bruch  0.000627
34  43984  loren.penkava@enron.com  .  0.000626
34  11195  morela.hernandez@enron.com  .  0.000607
34  18923  elizabeth.hutchinson@enron.com  Elizabeth Hutchinson  0.000606
34  1364  laura.lantefield@enron.com  Laura Lantefield  0.000600
34  1617  jennifer.fraser@enron.com  .  0.000562
34  41985  allen.cohrs@enron.com  .  0.000542
34  43946  larry.pavlou@enron.com  .  0.000516


******************************************************************************************************************* 
CATEGORY  35 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 6630  COMPONENTS: 17
LARGEST COMPONENT SIZE: 6569  PERCENT OF TOTAL GRAPH: 99.08%
GROUP DEGREE: 0.14227  GRAPH DENSITY: 0.00091
GROUP CLOSENESS: 0.00051  GROUP BETWEENNESS: 0.23975
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.01


MOST  PROBABLE  USERS 


Topic#  ID#  Email Address  Name  p(z|u)
35  548  jeff.king@enron.com  Jeff King  0.001014
35  20483  amber.limas@enron.com  .  0.000963
35  22685  lmay2@enron.com  .  0.000958
35  16217  rose.botello@enron.com  .  0.000802
35  53196  davette.warren@enron.com  .  0.000703
35  6837  diane.salcido@enron.com  Diane Salcido  0.000687
35  35590  gurley@enron.com  .  0.000526
35  36452  juantongia.calvin@enron.com  .  0.000481
35  18727  regina.blackshear@enron.com  Regina Blackshear  0.000479
35  19951  amber.ebow@enron.com  .  0.000428
35  53747  l..miller@enron.com  Chris L Miller  0.000417
35  6580  angela.barnett@enron.com  Angela Barnett  0.000411
35  19277  benjamin.freeman@enron.com  .  0.000406
35  81470  .carroll@enron.com  e-mail  0.000393
35  9260  don.stevens@enron.com  .  0.000384
35  16087  mudd'.'lisa@enron.com  Lisa Mudd  0.000374
35  86643  kevin_hyatt@enron.com  .  0.000355
35  6992  judy.hernandez@enron.com  Judy Hernandez  0.000354
35  71827  ddavis@enron.com  .  0.000347
35  19800  marilyn.rivera@enron.com  .  0.000345


CATEGORY  36 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 7447  COMPONENTS: 16
LARGEST COMPONENT SIZE: 7407  PERCENT OF TOTAL GRAPH: 99.46%
GROUP DEGREE: 0.06138  GRAPH DENSITY: 0.00094
GROUP CLOSENESS: 0.00083  GROUP BETWEENNESS: 0.07973
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.01


MOST  PROBABLE  USERS 


Topic#  ID#  Email Address  Name  p(z|u)
36  4110  klay@enron.com  .  0.001302
36  14521  qwest.net@mailman.enron.com  "smoray!qwest.net"  0.000498
36  14112  lfan@mailman.enron.com  "lfan"  0.000305
36  18017  ruth.mann@enron.com  .  0.000236
36  1911  resources.human@enron.com  Human Resources  0.000214
36  57681  munder.mail.list@mailman.enron.com  .  0.000143
36  12940  mike.underwood@enron.com  Mike Underwood  0.000140
36  13411  rebecca.longoria@enron.com  Rebecca Longoria  0.000140
36  12993  cecil.stinemetz@enron.com  Cecil Stinemetz  0.000139
36  419  steven.burnham@enron.com  Steven Burnham  0.000131
36  7634  s..smith@enron.com  Robert S. Smith  0.000130
36  49937  rsanders@enron.com  .  0.000129
36  24910  celestine.hollan@enron.com  .  0.000124
36  9171  sally.hsieh@enron.com  .  0.000117
36  21873  jeff.borg@enron.com  Jeff Borg  0.000117
36  12952  johnnie.nelson@enron.com  Johnnie Nelson  0.000115
36  18587  patricia.weatherspoon@enron.com  Patricia Weatherspoo  0.000114
36  32139  jnorden@enron.com  .  0.000113
36  7008  jona.kimbrough@enron.com  Jona Kimbrough  0.000109
36  21872  dave.ellis@enron.com  Dave Ellis  0.000108


CATEGORY  37 

EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 7197  COMPONENTS: 16
LARGEST COMPONENT SIZE: 7145  PERCENT OF TOTAL GRAPH: 99.28%
GROUP DEGREE: 0.07675  GRAPH DENSITY: 0.00083
GROUP CLOSENESS: 0.00052  GROUP BETWEENNESS: 0.10975
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.03


MOST  PROBABLE  USERS


Topic#  ID#  Email Address  Name  p(z|u)
37  22503  carlos.j.rodriguez@enron.com  carlos.j.rodriguez  0.001368
37  22763  briley@enron.com  .  0.001246
37  18724  kevin.alvarado@enron.com  Kevin Alvarado  0.000992
37  11081  rebecca.griffin@enron.com  Rebecca Griffin  0.000922
37  6067  alvin.thompson@enron.com  .  0.000914
37  6070  scott.loving@enron.com  .  0.000890
37  11116  megan.parker@enron.com  Megan Parker  0.000877
37  20241  susan.hadix@enron.com  .  0.000772
37  22446  juliann.kemp@enron.com  .  0.000739
37  35828  greg.mann@enron.com  Greg Mann  0.000721
37  3006  robert.ramirez@enron.com  Robert Ramirez  0.000695
37  22443  benjamin.schoene@enron.com  Benjamin Schoene  0.000695
37  14684  robert.cotten@enron.com  Robert Cotten  0.000693
37  8968  h..fletcher@enron.com  .  0.000691
37  11087  katherine.herrera@enron.com  Katherine Herrera  0.000653
37  9211  aimee.lannou@enron.com  .  0.000632
37  3007  l..dinari@enron.com  Sabra L. Dinari  0.000628
37  14685  kate.fraser@enron.com  Kate Fraser  0.000621
37  20244  laurie.ellis@enron.com  .  0.000619
37  6068  joe.casas@enron.com  .  0.000596


CATEGORY  38 

EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 7983  COMPONENTS: 13
LARGEST COMPONENT SIZE: 7942  PERCENT OF TOTAL GRAPH: 99.28%
GROUP DEGREE: 0.10007  GRAPH DENSITY: 0.00088
GROUP CLOSENESS: 0.00076  GROUP BETWEENNESS: 0.15980
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.02


MOST  PROBABLE  USERS 

Topic#  ID#  Email Address  Name  p(z|u)
38  79501  gblair@ei.enron.com  .  0.000872
38  26360  dgiron@enron.com  .  0.000727
38  56591  tracy_geaccone@enron.com  .  0.000724
38  35021  plove@ect.enron.com  .  0.000699
38  68835  jarnold@ect.enron.com  .  0.000673
38  69656  lcampbel@ect.enron.com  .  0.000664
38  22681  mmotley@enron.com  .  0.000659
38  85541  tgeacco@enron.com  .  0.000639
38  82262  pallen@ect.enron.com  .  0.000627
38  4493  lcampbel@enron.com  .  0.000624
38  37545  pplatte@enron.com  .  0.000622
38  4479  kruscit@enron.com  .  0.000610
38  22678  jforney@enron.com  .  0.000610
38  37735  kpresto@ect.enron.com  .  0.000602
38  3779  chuck.randall@enron.com  .  0.000553
38  36000  emclaug@enron.com  .  0.000550
38  68870  jarnold@ei.enron.com  .  0.000547
38  60337  mlenhart@ect.enron.com  Matt Lenhart  0.000522
38  26485  dgiron@ect.enron.com  .  0.000519
38  30015  kamins@ect.enron.com  kamins@ect.enron.com  0.000516


******************************************************************************************************************* 
CATEGORY  39 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 8313  COMPONENTS: 19
LARGEST COMPONENT SIZE: 8253  PERCENT OF TOTAL GRAPH: 99.28%
GROUP DEGREE: 0.07858  GRAPH DENSITY: 0.00084
GROUP CLOSENESS: 0.00042  GROUP BETWEENNESS: 0.13980
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.03


MOST  PROBABLE  USERS 


Topic#  ID#  Email Address  Name  p(z|u)
39  7571  maria.sandoval@enron.com  Maria Sandoval  0.000804
39  44164  jon.trevelise@enron.com  .  0.000754
39  44102  christi.culwell@enron.com  .  0.000717
39  19465  catherine.dumont@enron.com  .  0.000714
39  5076  cindy.shaffer@enron.com  .  0.000677
39  778  ashley.worthing@enron.com  Ashley Worthing  0.000676
39  5023  anne.jolibois@enron.com  .  0.000615
39  71921  warren.perry@enron.com  Warren Perry  0.000609
39  44142  ron.beidelman@enron.com  .  0.000602
39  7570  rebecca.sanchez@enron.com  Rebecca Sanchez  0.000598
39  16179  pearson.ken@enron.com  KEN PEARSON  0.000589
39  18727  regina.blackshear@enron.com  Regina Blackshear  0.000582
39  44275  paul.pfeffer@enron.com  .  0.000576
39  9021  richard.babin@enron.com  .  0.000573
39  6948  gary.stadler@enron.com  Gary Stadler  0.000568
39  69041  joanie.h.ngo@enron.com  .  0.000547
39  2349  craig.taylor@enron.com  Craig Taylor  0.000543
39  37179  fenner'.'molly@enron.com  Molly Fenner  0.000540
39  16840  *triolo@enron.com  .  0.000527
39  909  .jennifer@enron.com  e-mail  0.000526


CATEGORY  40 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 7067  COMPONENTS: 17
LARGEST COMPONENT SIZE: 7028  PERCENT OF TOTAL GRAPH: 99.45%
GROUP DEGREE: 0.11454  GRAPH DENSITY: 0.00071
GROUP CLOSENESS: 0.00088  GROUP BETWEENNESS: 0.18976
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.01


MOST  PROBABLE  USERS 


Topic#  ID#  Email Address  Name  p(z|u)
40  41733  dperlin@enron.com  "dperlin@enron.com"  0.001173
40  22614  pkeavey@enron.com  .  0.001028
40  60202  peter.f.keavey@enron.com  peter.f.keavey  0.000773
40  22317  heidi.dubose@enron.com  .  0.000763
40  2599  matt.smith@enron.com  Matt Smith  0.000637
40  65  anna.mehrer@enron.com  .  0.000633
40  14706  peter.keavey@enron.com  Peter Keavey  0.000622
40  81838  tstaab@enron.com  Theresa Staab  0.000589
40  83912  mark.e.taylor@enron.com  .  0.000513
40  1530  sandy.morris@enron.com  Sandy Morris  0.000490
40  686  d..hogan@enron.com  Irena D. Hogan  0.000472
40  18884  nyree.chanaba@enron.com  Nyree Chanaba  0.000436
40  7112  zionette.vincent@enron.com  Zionette Vincent  0.000431
40  56219  jane.m.tholt@enron.com  Jane Tholt  0.000420
40  682  claudia.guerra@enron.com  Claudia Guerra  0.000387
40  7482  tracy.ramsey@enron.com  Tracy Ramsey  0.000387
40  15105  trey.cash@enron.com  Trey Cash  0.000336
40  72458  jtholt@ect.enron.com  .  0.000273
40  8644  holly.keiser@enron.com  .  0.000259
40  78941  kim.watson@enron.com  .  0.000240


CATEGORY  41 

EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 6821  COMPONENTS: 15
LARGEST COMPONENT SIZE: 6772  PERCENT OF TOTAL GRAPH: 99.28%
GROUP DEGREE: 0.11865  GRAPH DENSITY: 0.00088
GROUP CLOSENESS: 0.00064  GROUP BETWEENNESS: 0.19975
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.02


MOST  PROBABLE  USERS


Topic#  ID#  Email Address  Name  p(z|u)
41  37217  c_r_zander@enron.com  .  0.001341
41  37146  fenner.chet@enron.com  Chet Fenner  0.001258
41  37115  chet.fenner@enron.com  .  0.001226
41  37220  erwollam@enron.com  .  0.001176
41  37151  knipe'.'chad@enron.com  chad knipe  0.001126
41  37218  feder'.'t@enron.com  .  0.001108
41  37114  wollam'.'erik@enron.com  .  0.001067
41  37159  mccomb.keith@enron.com  Keith McComb  0.001066
41  37223  chet_fenner@enron.com  .  0.001062
41  37122  wollam.erik@enron.com  .  0.001054
41  29102  chambers.john@enron.com  John Chambers  0.001044
41  37216  feder.t@enron.com  .  0.001040
41  37161  mccomb.chris@enron.com  Chris McComb  0.000961
41  37157  constantine'.'brian@enron.com  .  0.000940
41  37152  corrier.brad@enron.com  Brad Corrier  0.000709
41  37147  knipe.chad@enron.com  chad knipe  0.000678
41  37145  constantine.brian@enron.com  Brian Constantine  0.000664
41  37189  mccomb'.'chris@enron.com  Chris McComb  0.000576
41  36662  sneal@ect.enron.com  .  0.000380
41  37190  mccomb'.'keith@enron.com  Keith McComb  0.000355


CATEGORY  42 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 7713  COMPONENTS: 22
LARGEST COMPONENT SIZE: 7651  PERCENT OF TOTAL GRAPH: 99.20%
GROUP DEGREE: 0.12212  GRAPH DENSITY: 0.00091
GROUP CLOSENESS: 0.00046  GROUP BETWEENNESS: 0.18979
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.02


MOST  PROBABLE  USERS 


Topic#  ID#  Email Address  Name  p(z|u)
42  2374  kristin.quinn@enron.com  Kristin Quinn  0.000567
42  12449  dave.lawlor@enron.com  .  0.000500
42  18877  greg.bruch@enron.com  Greg Bruch  0.000459
42  2129  .chad@enron.com  e-mail  0.000415
42  15719  liz.hillman@enron.com  .  0.000394
42  13283  michelle.foust@enron.com  L Michelle Foust  0.000378
42  13263  robert.gerry@enron.com  Robert Gerry  0.000351
42  20070  pavel.zadorozhny@enron.com  .  0.000340
42  34229  mpalmer@enron.com  mpalmer@enron.com  0.000331
42  11222  david.marye@enron.com  .  0.000297
42  12474  john.kemp@enron.com  John Kemp  0.000272
42  82051  *appling@enron.com  .  0.000272
42  428  dan.dorland@enron.com  Dan Dorland  0.000271
42  30902  christopher.long@enron.com  .  0.000266
42  11017  the.globalist@enron.com  .  0.000257
42  49848  ca.team@enron.com  .  0.000257
42  29903  ze.powergroup.inc.@mailman.enron.com  .  0.000245
42  33746  louise@enron.com  .  0.000226
42  76534  gwoulfe@enron.com  .  0.000224
42  12475  laura.ewald@enron.com  Laura Ewald  0.000212


*******************************************************************************************************************
CATEGORY  43 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 7434  COMPONENTS: 18
LARGEST COMPONENT SIZE: 7388  PERCENT OF TOTAL GRAPH: 99.38%
GROUP DEGREE: 0.12082  GRAPH DENSITY: 0.0004
GROUP CLOSENESS: 0.00072  GROUP BETWEENNESS: 0.21977
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.03


MOST  PROBABLE  USERS 


Topic#  ID#  Email Address  Name  p(z|u)
43  50081  dfulton@enron.com  .  0.001103
43  50873  eletke@enron.com  .  0.001050
43  41598  snovose@enron.com  .  0.000965
43  41583  rfrank@enron.com  .  0.000957
43  49892  hap_boyd@enron.com  "Hap Boyd "  0.000945
43  24207  .sue@enron.com  e-mail  0.000877
43  49932  tjohnso8@enron.com  .  0.000875
43  52415  johnson.tamara@enron.com  .  0.000820
43  34229  mpalmer@enron.com  mpalmer@enron.com  0.000795
43  2276  becky.merola@enron.com  Becky Merola  0.000786
43  34210  shapiro.rick@enron.com  .  0.000766
43  8474  david.nutt@enron.com  David Nutt  0.000716
43  50605  james_trudeau@enron.com  "Jim Trudeau "  0.000715
43  21134  corey.wilkes@enron.com  Corey Wilkes  0.000704
43  49848  ca.team@enron.com  .  0.000701
43  49937  rsanders@enron.com  .  0.000693
43  49829  bhawkin@enron.com  .  0.000638
43  24810  rita.bahner@enron.com  .  0.000623
43  37928  dblack@enron.com  .  0.000613
43  48839  smara@ect.enron.com  Sue" "Mara  0.000606


CATEGORY  44 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 6800  COMPONENTS: 18
LARGEST COMPONENT SIZE: 6747  PERCENT OF TOTAL GRAPH: 99.22%
GROUP DEGREE: 0.08448  GRAPH DENSITY: 0.00088
GROUP CLOSENESS: 0.00056  GROUP BETWEENNESS: 0.13972
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.01


MOST  PROBABLE  USERS 


Topic#  ID#  Email Address  Name  p(z|u)
44  20010  cstclai@enron.com  .  0.000522
44  15530  .mom@enron.com  e-mail  0.000517
44  7507  crystal.reyna@enron.com  Crystal Reyna  0.000506
44  6277  teresa.mcomber@enron.com  Teresa McOmber  0.000451
44  10561  fahd.lodi@enron.com  .  0.000435
44  29310  claudia.santos@enron.com  Claudia Santos  0.000416
44  1040  'kuehn@enron.com  .  0.000414
44  43991  judy.kudym@enron.com  .  0.000404
44  77911  walt.serrano@enron.com  Walt Serrano  0.000404
44  510  israel.estrada@enron.com  Israel Estrada  0.000391
44  1219  .gregory@enron.com  e-mail  0.000379
44  62765  .zeke@enron.com  e-mail  0.000377
44  19950  michelle.laurant@enron.com  .  0.000372
44  7146  kristi.demaiolo@enron.com  Kristi Demaiolo  0.000369
44  43107  unspecified-recipients@enron.com  unspecified-recipien  0.000353
44  64938  dawson@enron.com  .  0.000349
44  15340  lenine.jeganathan@enron.com  .  0.000340
44  83620  ludwig@enron.com  .  0.000340
44  3021  scott.hendrickson@enron.com  Scott Hendrickson  0.000338
44  51739  antoine.duvauchelle@enron.com  .  0.000336


CATEGORY  45 

EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 8824  COMPONENTS: 16
LARGEST COMPONENT SIZE: 8770  PERCENT OF TOTAL GRAPH: 99.39%
GROUP DEGREE: 0.09991  GRAPH DENSITY: 0.00079
GROUP CLOSENESS: 0.00050  GROUP BETWEENNESS: 0.16982
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.02

MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
45      15949  alexandra.saler@enron.com                               0.000694
45      3643   alexandra.villarreal@enron.com    Alexandra Villarreal  0.000660
45      10757  alma.martinez@enron.com                                 0.000585
45      2060   pete.heintzelman@enron.com        Pete Heintzelman      0.000557
45      61637  charlotte.kraham@enron.com                              0.000503
45      75867  1 jewell@enron.com                                      0.000503
45      66817  thomas.’paul@enron.com                                  0.000468
45      7470   kay.quigley@enron.com             Kay Quigley           0.000446
45      22     crystal.hyde@enron.com            Crystal Hyde          0.000444
45      25314  veselack@enron.com                                      0.000444
45      26591  ora.cross@enron.com               Ora Cross             0.000439
45      29138  douglas.nichols@enron.com                               0.000426
45      75299  Jerb@enron.com                                          0.000422
45      82927  ’ peters@enron.com                                      0.000408
45      2609   alex.villarreal@enron.com         Alex Villarreal       0.000404
45      6857   denae.umbower@enron.com           Denae Umbower         0.000398
45      15263  patrice.mims@enron.com                                  0.000391
45      104    laura.wente@enron.com                                   0.000386
45      77998  3 mack@enron.com                                        0.000376
45      25315  germany.jr@enron.com                                    0.000373


CATEGORY  46 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 7360  COMPONENTS: 25
LARGEST COMPONENT SIZE: 7298  PERCENT OF TOTAL GRAPH: 99.16%
GROUP DEGREE: 0.10930  GRAPH DENSITY: 0.00068
GROUP CLOSENESS: 0.00042  GROUP BETWEENNESS: 0.16976
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.01

MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
46      15748  plove@enron.com                                         0.000665
46      70     mark.brand@enron.com              Mark Brand            0.000638
46      673    frank.ermis@enron.com             Frank Ermis           0.000552
46      734    dutch.quigley@enron.com           Dutch Quigley         0.000492
46      44193  1.foust@enron.com                                       0.000476
46      11221  aaron.martinsen@enron.com                               0.000475
46      488    mike.carson@enron.com             Mike Carson           0.000422
46      1129   andy.pace@enron.com               Andy Pace             0.000414
46      38006  servello.anthony@enron.com                              0.000410
46      1140   joe.stepenovitch@enron.com        Joe Stepenovitch      0.000392
46      2629   bbutler2@enron.com                                      0.000365
46      1115   clint.dean@enron.com              Clint Dean            0.000326
46      3426   mswerzb@ect.enron.com                                   0.000300
46      82632  lester.terry@enron.com                                  0.000295
46      53621  duke.kyle@enron.com               Kyle Duke             0.000291
46      1952   valerie.ramsower@enron.com        Valerie Ramsower      0.000280
46      37391  .judy@enron.com                   e-mail                0.000275
46      761    m..tholt@enron.com                Jane M. Tholt         0.000268
46      60337  mlenhart@ect.enron.com            Matt Lenhart          0.000268
46      7096   mike.sheedy@enron.com             Mike Sheedy           0.000259

******************************************
CATEGORY 47

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 6248  COMPONENTS: 19
LARGEST COMPONENT SIZE: 6198  PERCENT OF TOTAL GRAPH: 99.20%
GROUP DEGREE: 0.08176  GRAPH DENSITY: 0.00096
GROUP CLOSENESS: 0.00062  GROUP BETWEENNESS: 0.10967
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.02

MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
47      566    evelyn.metoyer@enron.com          Evelyn Metoyer        0.001523
47      1056   kerri.thompson@enron.com          Kerri Thompson        0.001469
47      582    stephanie.piwetz@enron.com        Stephanie Piwetz      0.001131
47      175    lisa.gang@enron.com                                     0.001077
47      24113  lgang@enron.com                                         0.000870
47      138    kysa.alport@enron.com                                   0.000707
47      796    shift.dl-portland@enron.com       DL-Portland Real Tim  0.000694
47      8792   sitara@enron.com                  Sitara                0.000674
47      490    sharen.cason@enron.com            Sharen Cason          0.000656
47      2009   alexander.mcelreath@enron.com     Alexander McElreath   0.000600
47      92     holden.salisbury@enron.com        Holden Salisbury      0.000531
47      1051   judy.dyer@enron.com               Judy Dyer             0.000433
47      17096  kimberly.allen@enron.com                                0.000432
47      1082   billy.braddock@enron.com          Billy Braddock        0.000410
47      981    Portland.shift@enron.com          Portland Shift        0.000393
47      20     geir.solberg@enron.com            Geir Solberg          0.000390
47      9032   jennifer.blay@enron.com                                 0.000380
47      972    shift.portland@enron.com          Portland Shift        0.000370
47      21     kate.symes@enron.com              Kate Symes            0.000359
47      539    kimberly.indelicato@enron.com     Kimberly Indelicato   0.000333


B.3  PLSI-U  with  all  Words  (No  Dictionary) 

CATEGORY  0 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 168  COMPONENTS: 1
LARGEST COMPONENT SIZE: 168  PERCENT OF TOTAL GRAPH: 100%
GROUP DEGREE: 0.75254  GRAPH DENSITY: 0.05988
GROUP CLOSENESS: 0.34176  GROUP BETWEENNESS: 0.27509
AVERAGE p(z|u): 0.56  STDEV p(z|u): 0.34
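
As a reading aid, the sketch below shows one way the simpler statistics in these blocks (COMPONENTS, LARGEST COMPONENT SIZE, PERCENT OF TOTAL GRAPH, GRAPH DENSITY) could be computed from an undirected edge list. It assumes density is the number of distinct edges divided by n(n-1)/2, which is consistent with the values reported here but is not stated in this appendix; under that assumption, 168 vertices at density 0.05988 would correspond to roughly 840 edges. The function name `network_stats` and its input format are hypothetical, and the GROUP centralization measures are not covered.

```python
from collections import defaultdict


def network_stats(num_vertices, edges):
    """Summary statistics for an undirected graph on vertices 0..num_vertices-1.

    Hypothetical helper: `edges` is an iterable of (a, b) vertex pairs.
    Self-loops are ignored and repeated pairs count once.
    """
    adjacency = defaultdict(set)
    for a, b in edges:
        if a != b:
            adjacency[a].add(b)
            adjacency[b].add(a)

    # Depth-first search to collect the size of every connected component.
    seen = set()
    component_sizes = []
    for start in range(num_vertices):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        component_sizes.append(size)

    # Each undirected edge appears in two adjacency sets.
    unique_edges = sum(len(nbrs) for nbrs in adjacency.values()) // 2
    possible_edges = num_vertices * (num_vertices - 1) / 2
    largest = max(component_sizes) if component_sizes else 0

    return {
        "vertices": num_vertices,
        "components": len(component_sizes),
        "largest_component": largest,
        "percent_of_graph": 100.0 * largest / num_vertices if num_vertices else 0.0,
        # Assumed definition: observed edges over all possible vertex pairs.
        "density": unique_edges / possible_edges if possible_edges else 0.0,
    }
```

For example, `network_stats(6, [(0, 1), (1, 2), (3, 4)])` describes a graph with three components (one of them an isolated vertex), a largest component of three vertices, and three of fifteen possible edges.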


MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
0       253    jeff.dasovich@enron.com           Jeff Dasovich         0.014615
0       3132   james.wright@enron.com                                  0.010392
0       9244   richard.sanders@enron.com                               0.008802
0       801    susan.mara@enron.com              Susan Mara            0.008433
0       1746   scott.stoness@enron.com                                 0.008333
0       1475   dennis.benevides@enron.com                              0.007993
0       8546   sandra.mccubbin@enron.com         Sandra McCubbin       0.007983
0       817    richard.shapiro@enron.com         Richard Shapiro       0.007954
0       1489   james.steffes@enron.com                                 0.007633
0       2222   harry.kingerski@enron.com         Harry Kingerski       0.007553
0       181    paul.kaufman@enron.com                                  0.007373
0       3140   marty.sunde@enron.com                                   0.007369
0       2318   vicki.sharp@enron.com             Vicki Sharp           0.007312
0       1016   neil.bresnan@enron.com            Neil Bresnan          0.007169
0       1180   karen.denne@enron.com             Karen Denne           0.007139
0       3152   wanda.curry@enron.com                                   0.006854
0       7213   mike.smith@enron.com              Mike Smith            0.006835
0       1456   dan.leff@enron.com                Dan Leff              0.006735
0       8431   skean@enron.com                   Steve Kean            0.006728
0       802    gordon.savage@enron.com           Gordon Savage         0.006718


CATEGORY  1 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 
VERTICES: 19  COMPONENTS: 2
LARGEST COMPONENT SIZE: 17  PERCENT OF TOTAL GRAPH: 89.47%
GROUP DEGREE: 0.42810  GRAPH DENSITY: 0.11111
GROUP CLOSENESS: 0.07210  GROUP BETWEENNESS: 0.50944
AVERAGE p(z|u): 0.37  STDEV p(z|u): 0.38

MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
1       28280  jbryson@enron.com                                       0.026801
1       256    pete.davis@enron.com              Pete Davis            0.008654
1       46814  jdasovic@enron.com                "Jeff Dasovich "      0.008066
1       47085  jfawcet@enron.com                                       0.005827
1       14935  susan.scott@enron.com                                   0.005779
1       2941   sscott3@enron.com                                       0.004511
1       20     geir.solberg@enron.com            Geir Solberg          0.004187
1       14     mark.guzman@enron.com             Mark Guzman           0.004186
1       19     ryan.slinger@enron.com            Ryan Slinger          0.004186
1       12     craig.dean@enron.com              Craig Dean            0.004000
1       28809  pchoi2@enron.com                                        0.003113
1       8      bill.williams@enron.com           Bill Williams III     0.003057
1       108    albert.meyers@enron.com                                 0.002382
1       152    john.anderson@enron.com           John Anderson         0.002317
1       219    michael.mier@enron.com            Michael Mier          0.002317
1       253    jeff.dasovich@enron.com           Jeff Dasovich         0.001929
1       15     leaf.harasin@enron.com            Leaf Harasin          0.001854
1       17     bert.meyers@enron.com             Bert Meyers           0.001850
1       79     eric.linder@enron.com             Eric Linder           0.001659
1       48157  bgaillar@enron.com                "                     0.001439


CATEGORY  2 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

AVERAGE p(z|u): 0.28  STDEV p(z|u): 0.26

MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
2       46814  jdasovic@enron.com                "Jeff Dasovich "      0.011197
2       46859  smara@enron.com                   "                     0.007691
2       30959  jhartso@enron.com                                       0.003643
2       85760  mhain@ect.enron.com               "                     0.003588
2       48638  susan_j_mara@enron.com                                  0.003288
2       46847  mpetroch@enron.com                "Mona Petrochko "     0.002137
2       48611  dparque@ect.enron.com             "Dave Parquet"        0.002054
2       23755  bob.gates@enron.com                                     0.001577
2       48573  rboyd@enron.com                   "Hap Boyd"            0.001487
2       49892  hap_boyd@enron.com                "Hap Boyd "           0.001315
2       166    lysa.akin@enron.com                                     0.001259
2       28868  mhain@enron.com                                         0.001066
2       49932  tjohnso8@enron.com                                      0.001005
2       46826  jalamo@enron.com                  "                     0.000912
2       10551  hap.boyd@enron.com                                      0.000824
2       48738  smccubbi@enron.com                                      0.000725
2       52415  johnson.tamara@enron.com                                0.000659
2       48090  jeff_dasovich@enron.com                                 0.000655
2       48839  smara@ect.enron.com               Sue" "Mara            0.000610
2       48736  hkingers@enron.com                                      0.000573


***************************************************************************************************************** 
CATEGORY  3 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 652  COMPONENTS: 1
LARGEST COMPONENT SIZE: 652  PERCENT OF TOTAL GRAPH: 100%
GROUP DEGREE: 0.55654  GRAPH DENSITY: 0.00614
GROUP CLOSENESS: 0.24734  GROUP BETWEENNESS: 0.72785
AVERAGE p(z|u): 0.43  STDEV p(z|u): 0.40

MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
3       3113   sara.shackleton@enron.com                               0.072098
3       11     monika.causholli@enron.com        Monika Causholli      0.047180
3       17807  clement.abrams@enron.com          Clement Abrams        0.017906
3       8379   clint.freeland@enron.com          Clint Freeland        0.014831
3       18882  e..carter@enron.com               Karen E. Carter       0.013859
3       18855  david.allan@enron.com             David Allan           0.013510
3       3079   r..conner@enron.com               Andrew R. Conner      0.013198
3       7528   t..robinson@enron.com             Richard T. Robinson   0.013069
3       11391  james.bryja@enron.com                                   0.012995
3       18901  dirk.dimitry@enron.com            Dirk Dimitry          0.012590
3       1589   bob.crane@enron.com                                     0.012588
3       18877  greg.bruch@enron.com              Greg Bruch            0.012500
3       7514   craig.rickard@enron.com           Craig Rickard         0.012178
3       7029   jeff.nogid@enron.com              Jeff Nogid            0.011857
3       8497   joel.ephross@enron.com            Joel Ephross          0.010795
3       41477  gareth.bahlmann@enron.com                               0.009969
3       4637   jim.armogida@enron.com                                  0.009464
3       18926  ayesha.kanji@enron.com            Ayesha Kanji          0.009102
3       2365   mary.cook@enron.com               Mary Cook             0.008981
3       255    angela.davis@enron.com            Angela Davis          0.008858

CATEGORY 4

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 353  COMPONENTS: 88
LARGEST COMPONENT SIZE: 158  PERCENT OF TOTAL GRAPH: 44.76%
GROUP DEGREE: 0.20608  GRAPH DENSITY: 0.01136
GROUP CLOSENESS: 0.00041  GROUP BETWEENNESS: 0.09861
AVERAGE p(z|u): 0.84  STDEV p(z|u): 0.32

MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
4       25048  paul.y'barbo@enron.com                                  0.016515
4       1005   mark.fisher@enron.com                                   0.011929
4       4911   hollis.kimbrough@enron.com                              0.007286
4       16327  dan.masters@enron.com             Dan Masters           0.007236
4       27797  wayne.perry@enron.com                                   0.006743
4       53483  tony.galt@enron.com                                     0.005676
4       23696  mark.walker@enron.com                                   0.005020
4       54051  jim.fernie@enron.com              Jim Fernie            0.004713
4       81838  tstaab@enron.com                  Theresa Staab         0.003723
4       2317   greg.curran@enron.com             Greg Curran           0.003678
4       23690  jeff.duff@enron.com                                     0.003603
4       6061   mariella.mahan@enron.com                                0.003376
4       60973  kruscit@ect.enron.com             Kevin                 0.003130
4       71045  mcuilla@ect.enron.com             Martin Cuilla         0.002903
4       7611   rick.sierra@enron.com             Rick Sierra           0.002624
4       23703  kurt.anderson@enron.com                                 0.002590
4       63242  miguel.maltes@enron.com           Miguel Maltes         0.002573
4       63248  federico.haeussler@enron.com                            0.002330
4       22614  pkeavey@enron.com                                       0.002308
4       23699  kevin.cousineau@enron.com                               0.002291


CATEGORY  5 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 


VERTICES: 519  COMPONENTS: 7
LARGEST COMPONENT SIZE: 286  PERCENT OF TOTAL GRAPH: 55.11%
GROUP DEGREE: 0.25873  GRAPH DENSITY: 0.00386
GROUP CLOSENESS: 0.00054  GROUP BETWEENNESS: 0.15861
AVERAGE p(z|u): 0.74  STDEV p(z|u): 0.37

MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
5       78508  kholst@enron.com                                        0.009242
5       45145  jsteffe@enron.com                                       0.007759
5       462    ryan.watt@enron.com               Ryan Watt             0.005368
5       417    stephane.brodeur@enron.com        Stephane Brodeur      0.005330
5       55961  fermis@enron.com                                        0.005200
5       55981  sbrewer@enron.com                                       0.004987
5       22612  ssouth@enron.com                                        0.004691
5       55972  Iprior@enron.com                                        0.004691
5       54135  bullets@enron.com                                       0.003601
5       43980  sharrisl@enron.com                                      0.002856
5       78538  kholst@ect.enron.com                                    0.002797
5       3392   johnson@enron.com                                       0.001948
5       22786  kimberly.watson@enron.com         Kimberly Watson       0.001907
5       155    elizabeth.sager@enron.com         Elizabeth Sager       0.001861
5       77861  wood@enron.com                                          0.001642
5       54121  market.team@enron.com                                   0.001592
5       3497   stuart.rexrode@enron.com                                0.001385
5       72534  tomlinson@enron.com                                     0.001373
5       11699  evans@enron.com                                         0.001338
5       35765  l..johnson@enron.com              David L. Johnson      0.001333


******************************************************************************************************************* 
CATEGORY  6 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 327  COMPONENTS: 5
LARGEST COMPONENT SIZE: 239  PERCENT OF TOTAL GRAPH: 73.09%
GROUP DEGREE: 0.29613  GRAPH DENSITY: 0.00920
GROUP CLOSENESS: 0.00213  GROUP BETWEENNESS: 0.30785
AVERAGE p(z|u): 0.48  STDEV p(z|u): 0.42

MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
6       1454   j..kean@enron.com                 Steven J. Kean        0.134864
6       2390   brent.hendry@enron.com            Brent Hendry          0.055713
6       3113   sara.shackleton@enron.com                               0.036853
6       1179   .palmer@enron.com                 pr                    0.036653
6       1180   karen.denne@enron.com             Karen Denne           0.019328
6       20382  lynn.aven@enron.com                                     0.018104
6       741    jenny.rub@enron.com               Jenny Rub             0.015713
6       3450   bill.donovan@enron.com                                  0.015673
6       8519   steve.hotte@enron.com             Steve Hotte           0.013934
6       20029  andrea.calo@enron.com                                   0.011996
6       1634   bruce.harris@enron.com                                  0.011846
6       2439   john.brindle@enron.com                                  0.011136
6       1463   maureen.mcvicker@enron.com                              0.010873
6       711    bob.mcauliffe@enron.com           Bob McAuliffe         0.008630
6       56819  andrea.bertone@enron.com                                0.008554
6       17612  scott.abshire@enron.com           Scott Abshire         0.008536
6       671    keith.dziadek@enron.com           Keith Dziadek         0.008170
6       17563  barton.clark@enron.com                                  0.007417
6       8420   randy.petersen@enron.com          Randy Petersen        0.007152
6       9039   theresa.brogan@enron.com                                0.007129

***************************************************************************************************************** 
CATEGORY  7 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 687  COMPONENTS: 1
LARGEST COMPONENT SIZE: 687  PERCENT OF TOTAL GRAPH: 100%
GROUP DEGREE: 0.38592  GRAPH DENSITY: 0.01020
GROUP CLOSENESS: 0.16711  GROUP BETWEENNESS: 0.54800
AVERAGE p(z|u): 0.39  STDEV p(z|u): 0.36

MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
7       1612   j..farmer@enron.com                                     0.023699
7       602    robert.superty@enron.com          Robert Superty        0.018165
7       757    patti.sullivan@enron.com          Patti Sullivan        0.015374
7       550    victor.lamadrid@enron.com         Victor Lamadrid       0.014556
7       549    lisa.kinsey@enron.com             Lisa Kinsey           0.013849
7       645    bryce.baxter@enron.com            Bryce Baxter          0.011966
7       691    tammy.jaquet@enron.com            Tammy Jaquet          0.010387
7       514    Clarissa.garcia@enron.com         Clarissa Garcia       0.009664
7       344    m..smith@enron.com                Regan M. Smith        0.008862
7       8931   kevin.heal@enron.com              Kevin Heal            0.008748
7       11116  megan.parker@enron.com            Megan Parker          0.007795
7       1777   rita.wynne@enron.com                                    0.007718
7       593    l..schrab@enron.com               Mark L. Schrab        0.007635
7       327    matt.pena@enron.com               Matt Pena             0.007385
7       732    richard.pinion@enron.com          Richard Pinion        0.007278
7       1400   donna.greif@enron.com             Donna Greif           0.006221
7       632    sherry.anastas@enron.com          Sherry Anastas        0.006147
7       637    natalie.baker@enron.com           Natalie Baker         0.006055
7       333    ramesh.rao@enron.com              Ramesh Rao            0.006026
7       14674  s..olinger@enron.com              Kimberly S. Olinger   0.005569

CATEGORY 8

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 309  COMPONENTS: 2
LARGEST COMPONENT SIZE: 307  PERCENT OF TOTAL GRAPH: 99.35%
GROUP DEGREE: 0.36347  GRAPH DENSITY: 0.01948
GROUP CLOSENESS: 0.05294  GROUP BETWEENNESS: 0.31571
AVERAGE p(z|u): 0.37  STDEV p(z|u): 0.38

MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
8       288    tana.jones@enron.com              Tana Jones            0.129389
8       355    .taylor@enron.com                 legal                 0.064858
8       2365   mary.cook@enron.com               Mary Cook             0.039520
8       5897   mark.taylor@enron.com                                   0.033920
8       7158   mark.greenberg@enron.com          Mark Greenberg        0.032178
8       437    peter.keohane@enron.com           Peter Keohane         0.027741
8       1019   leslie.hansen@enron.com           Leslie Hansen         0.022414
8       280    marie.heard@enron.com             Marie Heard           0.019159
8       436    greg.johnston@enron.com           Greg Johnston         0.017884
8       1092   travis.mccullough@enron.com       Travis McCullough     0.015881
8       284    t..hodge@enron.com                Jeffrey T. Hodge      0.015827
8       5889   frank.sayre@enron.com                                   0.014296
8       1091   carol.st.@enron.com               Carol St. Clair       0.014112
8       3113   sara.shackleton@enron.com                               0.013717
8       2390   brent.hendry@enron.com            Brent Hendry          0.013341
8       294    c..koehler@enron.com              Anne C. Koehler       0.012106
8       401    bob.shults@enron.com              Bob Shults            0.011187
8       8306   n..gray@enron.com                 Barbara N. Gray       0.010455
8       423    Sharon.crawford@enron.com         Sharon Crawford       0.009768
8       503    daniel.diamond@enron.com          Daniel Diamond        0.009520


CATEGORY  9 

EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 2146  COMPONENTS: 2
LARGEST COMPONENT SIZE: 2137  PERCENT OF TOTAL GRAPH: 99.58%
GROUP DEGREE: 0.19058  GRAPH DENSITY: 0.00186
GROUP CLOSENESS: 0.00847  GROUP BETWEENNESS: 0.32903
AVERAGE p(z|u): 0.29  STDEV p(z|u): 0.34

MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
9       2155   lara.leibman@enron.com            Lara Leibman          0.007156
9       2209   sue.nord@enron.com                Sue Nord              0.005958
9       5569   donald.lassere@enron.com                                0.005814
9       253    jeff.dasovich@enron.com           Jeff Dasovich         0.005405
9       2185   michelle.hicks@enron.com          Michelle Hicks        0.004549
9       47920  mike.dahlke@enron.com                                   0.004313
9       2156   ginger.dernehl@enron.com          Ginger Dernehl        0.003892
9       6112   jane.wilson@enron.com                                   0.003818
9       2186   robbi.rossi@enron.com             Robbi Rossi           0.003701
9       2784   ron.mcnamara@enron.com                                  0.003684
9       23995  james.ginty@enron.com                                   0.003598
9       2184   cynthia.harkness@enron.com        Cynthia Harkness      0.003557
9       5018   sylvia.hu@enron.com                                     0.003521
9       2187   wayne.gardner@enron.com           Wayne Gardner         0.003429
9       9488   hardie.davis@enron.com                                  0.003417
9       28735  xi.xi@enron.com                                         0.003387
9       2237   geriann.warner@enron.com          Geriann Warner        0.003285
9       2312   jan.haizmann@enron.com            Jan Haizmann          0.003271
9       5034   marcia.linton@enron.com                                 0.003265
9       1463   maureen.mcvicker@enron.com                              0.003241


******************************************************************************************************************* 
CATEGORY  10 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 689  COMPONENTS: 1
LARGEST COMPONENT SIZE: 689  PERCENT OF TOTAL GRAPH: 100%
GROUP DEGREE: 0.36647  GRAPH DENSITY: 0.00727
GROUP CLOSENESS: 0.19886  GROUP BETWEENNESS: 0.42712
AVERAGE p(z|u): 0.53  STDEV p(z|u): 0.41

MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
10      1637   rod.hayslett@enron.com                                  0.089934
10      18299  tracy.geaccone@enron.com          Tracy Geaccone        0.064229
10      4769   stanley.horton@enron.com                                0.053066
10      7573   james.saunders@enron.com          James Saunders        0.030331
10      5480   danny.mccarty@enron.com                                 0.029677
10      15310  bob.chandler@enron.com                                  0.014505
10      1647   a..howard@enron.com                                     0.013025
10      2280   shelley.corman@enron.com          Shelley Corman        0.011683
10      18986  john.cobb@enron.com                                     0.011220
10      4140   cindy.stark@enron.com                                   0.010721
10      18987  jerry.peters@enron.com                                  0.009553
10      5490   julie.armstrong@enron.com                               0.008680
10      17653  james.centilli@enron.com          James Centilli        0.008609
10      6084   morris.brassfield@enron.com                             0.008229
10      5489   kathy.campos@enron.com                                  0.008104
10      8308   steven.harris@enron.com           Steven Harris         0.007407
10      7953   dave.neubauer@enron.com           Dave Neubauer         0.007347
10      11386  dan.fancler@enron.com                                   0.007161
10      31667  steve.gilbert@enron.com                                 0.006977
10      8643   john.keiser@enron.com             John Keiser           0.006889


***************************************************************************************************************** 
CATEGORY  11 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 606  COMPONENTS: 3
LARGEST COMPONENT SIZE: 591  PERCENT OF TOTAL GRAPH: 97.52%
GROUP DEGREE: 0.36260  GRAPH DENSITY: 0.02149
GROUP CLOSENESS: 0.00466  GROUP BETWEENNESS: 0.20774
AVERAGE p(z|u): 0.39  STDEV p(z|u): 0.38

MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
11      4638   james.derrick@enron.com                                 0.036013
11      2201   cindy.olson@enron.com             Cindy Olson           0.019847
11      1461   kay.chapman@enron.com                                   0.016723
11      3443   mark.koenig@enron.com                                   0.016652
11      1477   greg.whalley@enron.com                                  0.016576
11      1490   Steven.kean@enron.com                                   0.015718
11      3538   mark.frevert@enron.com            Mark Frevert          0.015387
11      3444   jeffrey.mcmahon@enron.com                               0.014574
11      3441   kenneth.lay@enron.com                                   0.012490
11      1452   david.delainey@enron.com          David Delainey        0.011898
11      222    david.oxley@enron.com             David Oxley           0.011356
11      1543   richard.causey@enron.com                                0.011350
11      1544   rick.buy@enron.com                                      0.011146
11      3484   raymond.bowen@enron.com                                 0.010597
11      1463   maureen.mcvicker@enron.com                              0.010338
11      1724   paula.rieker@enron.com                                  0.010201
11      6156   j.harris@enron.com                                      0.010005
11      4058   jeff.skilling@enron.com                                 0.009845
11      4130   kevin.hannon@enron.com                                  0.009549
11      2345   liz.taylor@enron.com              Liz Taylor            0.009328

CATEGORY 12

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 376  COMPONENTS: 9
LARGEST COMPONENT SIZE: 342  PERCENT OF TOTAL GRAPH: 90.96%
GROUP DEGREE: 0.31566  GRAPH DENSITY: 0.01333
GROUP CLOSENESS: 0.00322  GROUP BETWEENNESS: 0.54592
AVERAGE p(z|u): 0.45  STDEV p(z|u): 0.40

MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
12      5897   mark.taylor@enron.com                                   0.202823
12      403    david.forster@enron.com                                 0.045717
12      21042  justin.boyd@enron.com             Justin Boyd           0.036726
12      3100   alan.aronowitz@enron.com                                0.025537
12      20030  david.minns@enron.com                                   0.024279
12      3114   paul.simons@enron.com                                   0.021258
12      22321  edmund.cooper@enron.com                                 0.019739
12      1698   susan.musch@enron.com                                   0.016082
12      404    rahil.jafry@enron.com                                   0.014950
12      5864   janine.juggins@enron.com                                0.013825
12      573    dale.neuner@enron.com             Dale Neuner           0.012468
12      31733  jeff.blumenthal@enron.com         Jeff Blumenthal       0.011894
12      20015  john.viverito@enron.com                                 0.011029
12      20022  jane.mcbride@enron.com                                  0.010295
12      18391  mark.evans@enron.com                                    0.010200
12      3046   jonathan.whitehead@enron.com      Jonathan Whitehead    0.009499
12      3475   bryan.seyfried@enron.com                                0.009468
12      18402  stephen.douglas@enron.com         Stephen H Douglas     0.009409
12      6098   debbie.brackett@enron.com                               0.009278
12      4786   dave.samuels@enron.com                                  0.009147


CATEGORY  13 

EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 1123  COMPONENTS: 1
LARGEST COMPONENT SIZE: 1123  PERCENT OF TOTAL GRAPH: 100%
GROUP DEGREE: 0.45866  GRAPH DENSITY: 0.00535
GROUP CLOSENESS: 0.21732  GROUP BETWEENNESS: 0.60890
AVERAGE p(z|u): 0.30  STDEV p(z|u): 0.35

MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
13      3041   george.mcclellan@enron.com        George Mcclellan      0.016337
13      403    david.forster@enron.com                                 0.010709
13      1718   daniel.reck@enron.com                                   0.009831
13      4134   jeffrey.shankman@enron.com                              0.009321
13      401    bob.shults@enron.com              Bob Shults            0.009060
13      3032   sheri.thomas@enron.com            Sheri Thomas          0.009008
13      2359   andy.zipper@enron.com             Andy Zipper           0.008698
13      1200   savita.puthigai@enron.com         Savita Puthigai       0.008645
13      1462   kimberly.hillis@enron.com                               0.008540
13      1687   kevin.mcgowan@enron.com                                 0.007943
13      227    sally.beck@enron.com              Sally Beck            0.007812
13      293    louise.kitchen@enron.com          Louise Kitchen        0.007649
13      595    kal.shah@enron.com                Kal Shah              0.007441
13      1752   mark.tawney@enron.com                                   0.007295
13      1695   torrey.moorer@enron.com                                 0.007059
13      3538   mark.frevert@enron.com            Mark Frevert          0.006945
13      407    jennifer.denny@enron.com                                0.006601
13      3476   stuart.staley@enron.com                                 0.006433
13      577    leonardo.pacheco@enron.com        Leonardo Pacheco      0.006428
13      4785   john.nowlan@enron.com                                   0.005572


******************************************************************************************************************* 
CATEGORY  14 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 949  COMPONENTS: 2
LARGEST COMPONENT SIZE: 947  PERCENT OF TOTAL GRAPH: 99.79%
GROUP DEGREE: 0.21779  GRAPH DENSITY: 0.01160
GROUP CLOSENESS: 0.04811  GROUP BETWEENNESS: 0.16829
AVERAGE p(z|u): 0.78  STDEV p(z|u): 0.34

MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
14      12134  kevin.hyatt@enron.com             Kevin Hyatt           0.027850
14      8308   steven.harris@enron.com           Steven Harris         0.025017
14      8303   drew.fossum@enron.com             Drew Fossum           0.023359
14      22786  kimberly.watson@enron.com         Kimberly Watson       0.018504
14      14935  susan.scott@enron.com                                   0.016858
14      24943  michelle.lokay@enron.com                                0.016388
14      9221   lorraine.lindberg@enron.com                             0.016099
14      23672  jeffery.fawcett@enron.com                               0.014024
14      24866  lindy.donoho@enron.com                                  0.013937
14      21117  tk.lohman@enron.com               TK Lohman             0.012491
14      2810   larry.campbell@enron.com                                0.010799
14      24902  glen.hass@enron.com                                     0.009459
14      39747  mary.miller@enron.com                                   0.008694
14      43927  rich.jolly@enron.com                                    0.008478
14      8320   louis.soldano@enron.com           Louis Soldano         0.007676
14      2280   shelley.corman@enron.com          Shelley Corman        0.007379
14      20524  lorna.brennan@enron.com                                 0.007022
14      43926  maria.pavlou@enron.com                                  0.006968
14      7953   dave.neubauer@enron.com           Dave Neubauer         0.006550
14      8476   john.shafer@enron.com                                   0.005879


***************************************************************************************************************** 
CATEGORY  15 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES: 232  COMPONENTS: 14
LARGEST COMPONENT SIZE: 180  PERCENT OF TOTAL GRAPH: 77.59%
GROUP DEGREE: 0.21397  GRAPH DENSITY: 0.02165
GROUP CLOSENESS: 0.00348  GROUP BETWEENNESS: 0.23424
AVERAGE p(z|u): 0.50  STDEV p(z|u): 0.42

MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
15      680    mike.grigsby@enron.com            Mike Grigsby          0.100385
15      10241  phillip.allen@enron.com                                 0.056589
15      629    k..allen@enron.com                Phillip K. Allen      0.041173
15      761    m..tholt@enron.com                Jane M. Tholt         0.034956
15      3659   scott.tholan@enron.com                                  0.031703
15      516    chris.gaskill@enron.com           Chris Gaskill         0.030711
15      1721   claudio.ribeiro@enron.com                               0.026102
15      80     john.zufferli@enron.com                                 0.023783
15      687    keith.holst@enron.com             Keith Holst           0.023231
15      56     tim.heizenrader@enron.com         Tim Heizenrader       0.020200
15      2553   tori.kuykendall@enron.com         Tori Kuykendall       0.018342
15      2356   kristin.walsh@enron.com           Kristin Walsh         0.018124
15      673    frank.ermis@enron.com             Frank Ermis           0.017959
15      737    jay.reitmeyer@enron.com           Jay Reitmeyer         0.015372
15      1673   james.lewis@enron.com                                   0.013522
15      445    rob.milnthorp@enron.com           Rob Milnthorp         0.011750
15      14875  jeffrey.snyder@enron.com                                0.010901
15      2575   joe.parks@enron.com               Joe Parks             0.009810
15      703    matthew.lenhart@enron.com         Matthew Lenhart       0.009466
15      675    l..gay@enron.com                  Randall L. Gay        0.009367

CATEGORY 16

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 2426  COMPONENTS: 3
LARGEST COMPONENT SIZE: 2413  PERCENT OF TOTAL GRAPH: 99.46%
GROUP DEGREE: 0.36036  GRAPH DENSITY: 0.00289
GROUP CLOSENESS: 0.00543  GROUP BETWEENNESS: 0.29942
AVERAGE p(z|u): 0.33  STDEV p(z|u): 0.34


MOST  PROBABLE  USERS 


Topic#  ID# 

16 

6909 

16 

1832 

16 

11182 

16 

1874 

16 

2032 

16 

1130 

16 

11262 

16 

11172 

16 

6702 

16 

1586 

16 

7106 

16 

6657 

16 

1758 

16 

8849 

16 

1878 

16 

7614 

16 

11 

16 

11227 

16 

7583 

16 

11402 

Email  Address  Name 

erin.willis@enron.com .  Erin  Willis . 

jana.giovannini@enron.com .  Jana  Giovannini . 

mar cus . edmonds@enr on .com . 

jody.crook@enron.com .  Jody  Crook . 

rahul.seksaria@enron.com .  Rahul  Seksaria . 

joseph.piotrowski@enron.com .  Joseph  Piotrowski .  .  . 

j  ames . wininger@enr on .com . 

Christopher . chenoweth@enron. com . 

chad.landry@enron.com .  Chad  Landry . 

mark . courtney@enr on . com . 

george.thomas@enron.com .  George  Thomas . 

anthony.sexton@enron.com .  Anthony  Sexton . 

carl . tricoli@enron.com . 

robin.rodrigue@enron.com .  Robin  Rodrigue . 

bryan.hull@enron.com .  Bryan  Hull . 

michael.simmons@enron.com .  Michael  Simmons . 

monika.causholli@enron.com .  Monika  Causholli .  .  .  . 

ravi ,mujumdar@enron. com . 

ethan.schultz@enron.com .  Ethan  Schultz . 

ron .bertasi@enron . com . 


p(z|u) 

0.002479 

0.002378 

0.002111 

0.001996 

0.001984 

0.001974 

0.001927 

0.001856 

0.001826 

0.001810 

0.001801 

0.001784 

0.001778 

0.001772 

0.001771 

0.001771 

0.001759 

0.001739 

0.001714 

0.001713 


CATEGORY 17

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 336                   COMPONENTS: 1
LARGEST COMPONENT SIZE: 336     PERCENT OF TOTAL GRAPH: 100%
GROUP DEGREE: 0.50323           GRAPH DENSITY: 0.02090
GROUP CLOSENESS: 0.24403        GROUP BETWEENNESS: 0.36570
AVERAGE p(z|u): 0.36            STDEV p(z|u): 0.36

MOST PROBABLE USERS

Topic#  ID#     Email Address                       Name                  p(z|u)
17      584     m..presto@enron.com                 Kevin M. Presto       0.072435
17      601     j..sturm@enron.com                  Fletcher J. Sturm     0.026405
17      607     d..thomas@enron.com                 Paul D. Thomas        0.026170
17      499     dana.davis@enron.com                Mark Dana Davis       0.025981
17      519     doug.gilbert-smith@enron.com        Doug Gilbert-smith    0.025086
17      1642    rogers.herndon@enron.com                                  0.019672
17      1990    harry.arora@enron.com               Harry Arora           0.019444
17      650     jae.black@enron.com                 Tamara Jae Black      0.016925
17      618     lloyd.will@enron.com                Lloyd Will            0.015534
17      226     don.baughman@enron.com              Don Baughman Jr       0.012883
17      477     robert.benson@enron.com             Robert Benson         0.012035
17      538     rika.imai@enron.com                 Rika Imai             0.011523
17      1126    tom.may@enron.com                   Tom May               0.010276
17      1103    d..baughman@enron.com               Edward D. Baughman    0.009412
17      1105    j..broderick@enron.com              Paul J. Broderick     0.009133
17      812     l..nicolay@enron.com                                      0.008751
17      314     jeffrey.miller@enron.com            Jeffrey Miller        0.008388
17      257     l..day@enron.com                    Smith L. Day          0.008188
17      1111    f..campbell@enron.com               Larry F. Campbell     0.007795
17      1148    gautam.gupta@enron.com              Gautam Gupta          0.007759


******************************************************************************************************************* 
CATEGORY 18

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 2284                  COMPONENTS: 5
LARGEST COMPONENT SIZE: 2252    PERCENT OF TOTAL GRAPH: 98.60%
GROUP DEGREE: 0.35760           GRAPH DENSITY: 0.00088
GROUP CLOSENESS: 0.00165        GROUP BETWEENNESS: 0.67926
AVERAGE p(z|u): 0.71            STDEV p(z|u): 0.41

MOST PROBABLE USERS

Topic#  ID#     Email Address                       Name                  p(z|u)
18      911     .john@enron.com                     e-mail                0.006060
18      3688    .david@enron.com                    e-mail                0.005111
18      2410    .mike@enron.com                     e-mail                0.004360
18      3719    .michael@enron.com                  e-mail                0.003620
18      907     .jeff@enron.com                     e-mail                0.003619
18      939     .scott@enron.com                    e-mail                0.003591
18      2431    .tom@enron.com                      e-mail                0.003547
18      883     .dan@enron.com                      e-mail                0.003233
18      3715    .mark@enron.com                     e-mail                0.003123
18      875     .bob@enron.com                      e-mail                0.003113
18      889     .eric@enron.com                     e-mail                0.003055
18      874     .bill@enron.com                     e-mail                0.002984
18      3228    .jim@enron.com                      e-mail                0.002980
18      913     .kevin@enron.com                    e-mail                0.002876
18      3229    .joe@enron.com                      e-mail                0.002865
18      947     .steve@enron.com                    e-mail                0.002734
18      2130    .chris@enron.com                    e-mail                0.002645
18      4058    jeff.skilling@enron.com                                   0.002447
18      933     .robert@enron.com                   e-mail                0.002406
18      895     .gary@enron.com                     e-mail                0.002283


***************************************************************************************************************** 
CATEGORY 19

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 525                   COMPONENTS: 4
LARGEST COMPONENT SIZE: 505     PERCENT OF TOTAL GRAPH: 96.19%
GROUP DEGREE: 0.36501           GRAPH DENSITY: 0.02290
GROUP CLOSENESS: 0.00377        GROUP BETWEENNESS: 0.33710
AVERAGE p(z|u): 0.77            STDEV p(z|u): 0.37

MOST PROBABLE USERS

Topic#  ID#     Email Address                       Name                  p(z|u)
19      43960   lynn.blair@enron.com                                      0.042597
19      2530    chris.germany@enron.com             Chris Germany         0.037474
19      23333   darrell.schoolcraft@enron.com                             0.020053
19      22786   kimberly.watson@enron.com           Kimberly Watson       0.019329
19      2280    shelley.corman@enron.com            Shelley Corman        0.018253
19      2575    joe.parks@enron.com                 Joe Parks             0.015446
19      6966    john.buchanan@enron.com             John Buchanan         0.014915
19      44099   terry.kowalke@enron.com                                   0.013305
19      9096    rick.dietz@enron.com                                      0.012458
19      24943   michelle.lokay@enron.com                                  0.010309
19      8528    mark.mcconnell@enron.com                                  0.008525
19      25049   raetta.zadow@enron.com                                    0.007601
19      3005    joann.collins@enron.com             Joann Collins         0.007235
19      53745   steve.january@enron.com             Steve January         0.006966
19      521     scott.goodell@enron.com             Scott Goodell         0.006878
19      22593   joan.veselack@enron.com                                   0.006711
19      23666   sheila.nacey@enron.com                                    0.006084
19      21117   tk.lohman@enron.com                 TK Lohman             0.005919
19      466     robert.allwein@enron.com            Robert Allwein        0.005870
19      19794   ramona.betancourt@enron.com                               0.005386


CATEGORY 20

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 76                    COMPONENTS: 8
LARGEST COMPONENT SIZE: 38      PERCENT OF TOTAL GRAPH: 50.00%
GROUP DEGREE: 0.17658           GRAPH DENSITY: 0.04000
GROUP CLOSENESS: 0.00547        GROUP BETWEENNESS: 0.15440
AVERAGE p(z|u): 0.85            STDEV p(z|u): 0.32

MOST PROBABLE USERS

Topic#  ID#     Email Address                       Name                  p(z|u)
20      253     jeff.dasovich@enron.com             Jeff Dasovich         0.051477
20      34417   alewis@ect.enron.com                Andrew Lewis          0.032326
20      734     dutch.quigley@enron.com             Dutch Quigley         0.020997
20      15669   sscott5@enron.com                                         0.016981
20      46814   jdasovic@enron.com                  "Jeff Dasovich "      0.015975
20      2592    m..scott@enron.com                  Susan M. Scott        0.012083
20      4493    lcampbel@enron.com                                        0.010704
20      41627   scorman@enron.com                                         0.010053
20      41601   rshapiro@enron.com                                        0.009135
20      48090   jeff_dasovich@enron.com                                   0.008397
20      77707   plucci@enron.com                    Paul Lucci            0.007649
20      2347    h..lewis@enron.com                  Andrew H. Lewis       0.006508
20      14935   susan.scott@enron.com                                     0.005442
20      548     jeff.king@enron.com                 Jeff King             0.005317
20      2453    undisclosed-recipients@enron.com    undisclosed-recipien  0.004478
20      41521   brapp@enron.com                                           0.004313
20      60089   shendri@ect.enron.com               scott hendrickson     0.004136
20      2515    martin.cuilla@enron.com             Martin Cuilla         0.004060
20      2587    kevin.ruscitti@enron.com            Kevin Ruscitti        0.004009
20      69405   khyatt@enron.com                                          0.003721


CATEGORY 21

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 380                   COMPONENTS: 1
LARGEST COMPONENT SIZE: 380     PERCENT OF TOTAL GRAPH: 100%
GROUP DEGREE: 0.38471           GRAPH DENSITY: 0.01319
GROUP CLOSENESS: 0.23608        GROUP BETWEENNESS: 0.45623
AVERAGE p(z|u): 0.32            STDEV p(z|u): 0.35

MOST PROBABLE USERS

Topic#  ID#     Email Address                       Name                  p(z|u)
21      3111    gerald.nemec@enron.com                                    0.131946
21      2530    chris.germany@enron.com             Chris Germany         0.047453
21      15273   dan.hyvl@enron.com                  Dan J Hyvl            0.043104
21      6815    debra.perlingiere@enron.com         Debra Perlingiere     0.042656
21      1688    ed.mcmichael@enron.com                                    0.031111
21      2514    ruth.concannon@enron.com            Ruth Concannon        0.020229
21      2575    joe.parks@enron.com                 Joe Parks             0.019008
21      20018   stuart.zisman@enron.com                                   0.018083
21      3540    maria.garza@enron.com               Maria Garza           0.015160
21      14717   eric.gillaspie@enron.com            Eric Gillaspie        0.014902
21      1399    eric.boyt@enron.com                 Eric Boyt             0.013725
21      14697   barbara.gray@enron.com              Barbara Gray          0.012778
21      1230    phil.polsky@enron.com               Phil Polsky           0.012274
21      8756    margaret.dhont@enron.com            Margaret Dhont        0.010542
21      17250   steve.hooser@enron.com                                    0.010413
21      2497    robin.barbe@enron.com               Robin Barbe           0.010170
21      2357    brian.redmond@enron.com             Brian Redmond         0.009285
21      1100    russell.diamond@enron.com           Russell Diamond       0.008737
21      11389   garrick.hill@enron.com                                    0.008569
21      1093    kay.mann@enron.com                  Kay Mann              0.008517


******************************************************************************************************************* 
CATEGORY 22

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 651                   COMPONENTS: 2
LARGEST COMPONENT SIZE: 649     PERCENT OF TOTAL GRAPH: 99.69%
GROUP DEGREE: 0.77675           GRAPH DENSITY: 0.00462
GROUP CLOSENESS: 0.07842        GROUP BETWEENNESS: 0.73914
AVERAGE p(z|u): 0.31            STDEV p(z|u): 0.38

MOST PROBABLE USERS

Topic#  ID#     Email Address                       Name                  p(z|u)
22      3644    kim.ward@enron.com                  Kim Ward              0.051340
22      14787   jason.williams@enron.com            Jason Williams        0.027822
22      213     chris.foster@enron.com              Chris H Foster        0.022375
22      1100    russell.diamond@enron.com           Russell Diamond       0.014306
22      20487   recipients@enron.com                                      0.011722
22      654     craig.breslau@enron.com             Craig Breslau         0.009517
22      14665   veronica.espinoza@enron.com         Veronica Espinoza     0.009159
22      480     bob.bowen@enron.com                 Bob Bowen             0.007059
22      2573    lucy.ortiz@enron.com                Lucy Ortiz            0.006014
22      1102    tom.moran@enron.com                 Tom Moran             0.005816
22      4854    william.bradford@enron.com                                0.005410
22      6098    debbie.brackett@enron.com                                 0.005187
22      14716   linda.ewing@enron.com               Linda Ewing           0.005035
22      1668    fred.lagrasta@enron.com                                   0.004921
22      206     christian.yoder@enron.com                                 0.004746
22      3542    lisa.gillette@enron.com             Lisa Gillette         0.004723
22      1117    david.fairley@enron.com             David Fairley         0.004711
22      14668   kim.theriot@enron.com               Kim Theriot           0.004640
22      269     genia.fitzgerald@enron.com          Genia Fitzgerald      0.004556
22      19867   james.shirley@enron.com                                   0.004538

*****************************************************************************************************************
CATEGORY 23

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 115                   COMPONENTS: 1
LARGEST COMPONENT SIZE: 115     PERCENT OF TOTAL GRAPH: 100%
GROUP DEGREE: 0.69523           GRAPH DENSITY: 0.05263
GROUP CLOSENESS: 0.33713        GROUP BETWEENNESS: 0.66430
AVERAGE p(z|u): 0.29            STDEV p(z|u): 0.37

MOST PROBABLE USERS

Topic#  ID#     Email Address                       Name                  p(z|u)
23      21      kate.symes@enron.com                Kate Symes            0.250872
23      8       bill.williams@enron.com             Bill Williams III     0.129422
23      47      cara.semperger@enron.com            Cara Semperger        0.062312
23      345     will.smith@enron.com                Will Smith            0.026454
23      566     evelyn.metoyer@enron.com            Evelyn Metoyer        0.025118
23      19      ryan.slinger@enron.com              Ryan Slinger          0.023141
23      156     david.poston@enron.com              David Poston          0.022346
23      20      geir.solberg@enron.com              Geir Solberg          0.020995
23      14      mark.guzman@enron.com               Mark Guzman           0.020132
23      17      bert.meyers@enron.com               Bert Meyers           0.019710
23      15      leaf.harasin@enron.com              Leaf Harasin          0.017548
23      230     corry.bentley@enron.com             Corry Bentley         0.017445
23      12      craig.dean@enron.com                Craig Dean            0.016386
23      79      eric.linder@enron.com               Eric Linder           0.016381
23      306     duong.luu@enron.com                 Duong Luu             0.015535
23      18      v..porter@enron.com                 David V. Porter       0.012784
23      1158    vishwanatha.venkataswami@enron.com  Vishwanatha Venkatas  0.012631
23      78      todd.bland@enron.com                Todd Bland            0.012504
23      1057    kimberly.hundl@enron.com            Kimberly Hundl        0.012452
23      582     stephanie.piwetz@enron.com          Stephanie Piwetz      0.012158


CATEGORY 24

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 4231                  COMPONENTS: 6
LARGEST COMPONENT SIZE: 4220    PERCENT OF TOTAL GRAPH: 99.74%
GROUP DEGREE: 0.20961           GRAPH DENSITY: 0.00047
GROUP CLOSENESS: 0.00563        GROUP BETWEENNESS: 0.28964
AVERAGE p(z|u): 0.43            STDEV p(z|u): 0.42

MOST PROBABLE USERS

Topic#  ID#     Email Address                       Name                  p(z|u)
24      6031    outlook.team@enron.com                                    0.012505
24      9321    susan.lopez@enron.com                                     0.002325
24      1493    april.hrach@enron.com                                     0.002086
24      41163   gwhalle@ect.enron.com                                     0.002050
24      228     michael.belmont@enron.com           Michael Belmont       0.001982
24      283     bob.hillier@enron.com               Bob Hillier           0.001928
24      2469    vicky.ha@enron.com                  Vicky Ha              0.001902
24      2466    victor.browner@enron.com            Victor Browner        0.001881
24      18593   scott.williamson@enron.com          Scott Williamson      0.001849
24      2019    r..harrington@enron.com                                   0.001796
24      19381   jim.fussell@enron.com                                     0.001770
24      24665   stephen.harrington@enron.com                              0.001770
24      785     jarod.jenson@enron.com              Jarod Jenson          0.001750
24      787     kevin.montagne@enron.com            Kevin Montagne        0.001737
24      10552   dan.bruce@enron.com                                       0.001732
24      406     michael.guadarrama@enron.com                              0.001730
24      1213    backbone.ens@enron.com                                    0.001727
24      114     gray.calvert@enron.com                                    0.001717
24      641     michael.barber@enron.com            Michael Barber        0.001714
24      6891    gail.kettenbrink@enron.com          Gail Kettenbrink      0.001712


CATEGORY 25

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 398                   COMPONENTS: 18
LARGEST COMPONENT SIZE: 266     PERCENT OF TOTAL GRAPH: 66.83%
GROUP DEGREE: 0.23041           GRAPH DENSITY: 0.00756
GROUP CLOSENESS: 0.00155        GROUP BETWEENNESS: 0.31710
AVERAGE p(z|u): 0.71            STDEV p(z|u): 0.39

MOST PROBABLE USERS

Topic#  ID#     Email Address                       Name                  p(z|u)
25      242     michelle.cash@enron.com             Michelle Cash         0.073555
25      1145    benjamin.rogers@enron.com           Benjamin Rogers       0.057591
25      1559    john.arnold@enron.com                                     0.043092
25      4134    jeffrey.shankman@enron.com                                0.028072
25      773     .ward@enron.com                     houston               0.020803
25      499     dana.davis@enron.com                Mark Dana Davis       0.018514
25      621     jason.wolfe@enron.com               Jason Wolfe           0.017101
25      4654    jennifer.burns@enron.com                                  0.015920
25      1140    joe.stepenovitch@enron.com          Joe Stepenovitch      0.013753
25      1465    twanda.sweet@enron.com                                    0.013010
25      2202    a..shankman@enron.com               Jeffrey A. Shankman   0.013010
25      19919   lavorato@enron.com                                        0.012111
25      1691    don.miller@enron.com                                      0.012004
25      11250   e.taylor@enron.com                                        0.010706
25      1689    jennifer.medcalf@enron.com                                0.008170
25      7568    monique.sanchez@enron.com           Monique Sanchez       0.007925
25      32271   jarnold@enron.com                   jarnold               0.007245
25      11156   sharon.butcher@enron.com            Sharon Butcher        0.006716
25      9264    kriste.sullivan@enron.com                                 0.006530
25      1617    jennifer.fraser@enron.com                                 0.005907

*****************************************************************************************************************
CATEGORY 26

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 408                   COMPONENTS: 11
LARGEST COMPONENT SIZE: 352     PERCENT OF TOTAL GRAPH: 86.27%
GROUP DEGREE: 0.21545           GRAPH DENSITY: 0.00491
GROUP CLOSENESS: 0.00230        GROUP BETWEENNESS: 0.37435
AVERAGE p(z|u): 0.66            STDEV p(z|u): 0.42

MOST PROBABLE USERS

Topic#  ID#     Email Address                       Name                  p(z|u)
26      2781    enron.announcements@enron.com                             0.069997
26      1078    40enron@enron.com                   Tracey Ramsey - Glob  0.067741
26      4058    jeff.skilling@enron.com                                   0.063704
26      427     chris.dorland@enron.com             Chris Dorland         0.061735
26      412     no.address@enron.com                                      0.044696
26      2599    matt.smith@enron.com                Matt Smith            0.043858
26      1175    arsystem@mailman.enron.com          ARSystem              0.028197
26      2883    all.houston@enron.com                                     0.021788
26      86      all.worldwide@enron.com             All Enron Worldwide   0.019729
26      6007    houston.report@enron.com                                  0.014175
26      22081   dfarmer@enron.com                                         0.013624
26      84      all_ena_egm_eim@enron.com           All_ENA_EGM_EIM       0.011844
26      710     larry.may@enron.com                 Larry May             0.010501
26      1757    colin.tonks@enron.com                                     0.009049
26      428     dan.dorland@enron.com               Dan Dorland           0.008364
26      262     david.dronet@enron.com              David Dronet          0.006543
26      4762    office.chairman@enron.com                                 0.006421
26      516     chris.gaskill@enron.com             Chris Gaskill         0.005673
26      2387    tara.piazze@enron.com               Tara Piazze           0.005357
26      6226    enron.announcement@enron.com                              0.004674


***************************************************************************************************************** 
CATEGORY 27

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 405                   COMPONENTS: 2
LARGEST COMPONENT SIZE: 399     PERCENT OF TOTAL GRAPH: 98.52%
GROUP DEGREE: 0.66146           GRAPH DENSITY: 0.01980
GROUP CLOSENESS: 0.01750        GROUP BETWEENNESS: 0.47792
AVERAGE p(z|u): 0.33            STDEV p(z|u): 0.37

MOST PROBABLE USERS

Topic#  ID#     Email Address                       Name                  p(z|u)
27      367     w..white@enron.com                  Stacey W. White       0.064266
27      268     casey.evans@enron.com               Casey Evans           0.025626
27      62      john.postlethwaite@enron.com        John Postlethwaite    0.020753
27      226     don.baughman@enron.com              Don Baughman Jr       0.020010
27      1104    kayne.coulter@enron.com             Kayne Coulter         0.018961
27      1120    juan.hernandez@enron.com            Juan Hernandez        0.017268
27      588     reagan.rorschach@enron.com          Reagan Rorschach      0.016150
27      618     lloyd.will@enron.com                Lloyd Will            0.015298
27      1126    tom.may@enron.com                   Tom May               0.013930
27      1044    rhonda.denton@enron.com             Rhonda Denton         0.012500
27      1140    joe.stepenovitch@enron.com          Joe Stepenovitch      0.012058
27      1117    david.fairley@enron.com             David Fairley         0.011785
27      1110    rudy.acevedo@enron.com              Rudy Acevedo          0.010945
27      292     john.kinser@enron.com               John Kinser           0.010925
27      1096    dean.laurent@enron.com              Dean Laurent          0.010373
27      2810    larry.campbell@enron.com                                  0.009907
27      267     joe.errigo@enron.com                Joe Errigo            0.009482
27      2740    chad.starnes@enron.com                                    0.009398
27      2813    miguel.garcia@enron.com                                   0.009299
27      314     jeffrey.miller@enron.com            Jeffrey Miller        0.008834


CATEGORY 28

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 385                   COMPONENTS: 3
LARGEST COMPONENT SIZE: 381     PERCENT OF TOTAL GRAPH: 98.96%
GROUP DEGREE: 0.43526           GRAPH DENSITY: 0.01302
GROUP CLOSENESS: 0.03227        GROUP BETWEENNESS: 0.57669
AVERAGE p(z|u): 0.34            STDEV p(z|u): 0.37

MOST PROBABLE USERS

Topic#  ID#     Email Address                       Name                  p(z|u)
28      765     barry.tycholiz@enron.com            Barry Tycholiz        0.078402
28      1769    mark.whitt@enron.com                                      0.048881
28      717     stephanie.miller@enron.com          Stephanie Miller      0.036904
28      712     jonathan.mckay@enron.com            Jonathan Mckay        0.030210
28      709     a..martin@enron.com                 Thomas A. Martin      0.028498
28      619     .williams@enron.com                 credit                0.028301
28      2591    jim.schwieger@enron.com             Jim Schwieger         0.025808
28      719     l..mims@enron.com                   Patrice L. Mims       0.019376
28      773     .ward@enron.com                     houston               0.016969
28      14660   theresa.staab@enron.com                                   0.014697
28      3605    t..lucci@enron.com                  Paul T. Lucci         0.013274
28      774     charles.weldon@enron.com            V. Charles Weldon     0.012376
28      2515    martin.cuilla@enron.com             Martin Cuilla         0.012031
28      1676    laura.luce@enron.com                                      0.010473
28      3548    tyrell.harrison@enron.com           Tyrell Harrison       0.010358
28      2362    gary.bryan@enron.com                Gary Bryan            0.010155
28      2520    tom.donohoe@enron.com               Tom Donohoe           0.010003
28      581     vladi.pimenov@enron.com             Vladi Pimenov         0.009959
28      747     s..shively@enron.com                Hunter S. Shively     0.009723
28      3111    gerald.nemec@enron.com                                    0.009570


CATEGORY 29

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 223                   COMPONENTS: 2
LARGEST COMPONENT SIZE: 213     PERCENT OF TOTAL GRAPH: 95.52%
GROUP DEGREE: 0.29045           GRAPH DENSITY: 0.02703
GROUP CLOSENESS: 0.01056        GROUP BETWEENNESS: 0.29059
AVERAGE p(z|u): 0.29            STDEV p(z|u): 0.34

MOST PROBABLE USERS

Topic#  ID#     Email Address                       Name                  p(z|u)
29      724     scott.neal@enron.com                Scott Neal            0.084950
29      2530    chris.germany@enron.com             Chris Germany         0.053341
29      764     judy.townsend@enron.com             Judy Townsend         0.032323
29      521     scott.goodell@enron.com             Scott Goodell         0.029347
29      2537    john.hodge@enron.com                John Hodge            0.024274
29      739     andrea.ring@enron.com               Andrea Ring           0.023856
29      550     victor.lamadrid@enron.com           Victor Lamadrid       0.023635
29      602     robert.superty@enron.com            Robert Superty        0.023183
29      1048    m..forney@enron.com                 John M. Forney        0.019497
29      2563    brad.mckay@enron.com                Brad Mckay            0.019347
29      2608    victoria.versen@enron.com           Victoria Versen       0.017748
29      2578    w..pereira@enron.com                Susan W. Pereira      0.016941
29      2548    f..keavey@enron.com                 Peter F. Keavey       0.016556
29      652     f..brawner@enron.com                Sandra F. Brawner     0.015440
29      14705   dick.jenkins@enron.com              Dick Jenkins          0.014574
29      770     frank.vickers@enron.com             Frank Vickers         0.014021
29      514     clarissa.garcia@enron.com           Clarissa Garcia       0.013714
29      2350    chuck.ames@enron.com                Chuck Ames            0.013374
29      2349    craig.taylor@enron.com              Craig Taylor          0.013243
29      2604    colleen.sullivan@enron.com          Colleen Sullivan      0.012638


******************************************************************************************************************* 
CATEGORY 30

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 482                   COMPONENTS: 1
LARGEST COMPONENT SIZE: 482     PERCENT OF TOTAL GRAPH: 100%
GROUP DEGREE: 0.43150           GRAPH DENSITY: 0.01040
GROUP CLOSENESS: 0.18911        GROUP BETWEENNESS: 0.41663
AVERAGE p(z|u): 0.48            STDEV p(z|u): 0.42

MOST PROBABLE USERS

Topic#  ID#     Email Address                       Name                  p(z|u)
30      2381    richard.ring@enron.com              Richard Ring          0.031736
30      1416    christi.nicolay@enron.com           Christi Nicolay       0.021109
30      2823    kevin.presto@enron.com                                    0.016200
30      2206    stacey.bolton@enron.com             Stacey Bolton         0.014342
30      817     richard.shapiro@enron.com           Richard Shapiro       0.013577
30      1489    james.steffes@enron.com                                   0.012671
30      813     sarah.novosel@enron.com                                   0.012512
30      1642    rogers.herndon@enron.com                                  0.011073
30      63      elliot.mainzer@enron.com            Elliot Mainzer        0.010418
30      519     doug.gilbert-smith@enron.com        Doug Gilbert-smith    0.009125
30      618     lloyd.will@enron.com                Lloyd Will            0.009028
30      2229    janine.migden@enron.com             Janine Migden         0.008796
30      2382    mark.bernstein@enron.com            Mark Bernstein        0.008443
30      2259    thane.twiggs@enron.com              Thane Twiggs          0.007970
30      2809    edward.baughman@enron.com                                 0.007922
30      1553    jeff.ader@enron.com                                       0.007858
30      413     john.llodra@enron.com               John Llodra           0.007837
30      5568    joe.kishkill@enron.com                                    0.007809
30      2784    ron.mcnamara@enron.com                                    0.007789
30      1485    donna.fulton@enron.com                                    0.007367


***************************************************************************************************************** 
CATEGORY 31

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 302                   COMPONENTS: 1
LARGEST COMPONENT SIZE: 302     PERCENT OF TOTAL GRAPH: 100%
GROUP DEGREE: 0.38487           GRAPH DENSITY: 0.05316
GROUP CLOSENESS: 0.19020        GROUP BETWEENNESS: 0.11591
AVERAGE p(z|u): 0.53            STDEV p(z|u): 0.40

MOST PROBABLE USERS

Topic#  ID#     Email Address                       Name                  p(z|u)
31      37      tim.belden@enron.com                Tim Belden            0.042943
31      59      diana.scholtes@enron.com            Diana Scholtes        0.036263
31      54      sean.crandall@enron.com             Sean Crandall         0.030541
31      42      jeff.richter@enron.com              Jeff Richter          0.029644
31      55      robert.badeer@enron.com             Robert Badeer         0.027046
31      57      matt.motley@enron.com               Matt Motley           0.026607
31      60      mike.swerzbin@enron.com             Mike Swerzbin         0.024860
31      52      tom.alonso@enron.com                Tom Alonso            0.024538
31      53      mark.fischer@enron.com              Mark Fischer          0.024408
31      93      phillip.platter@enron.com           Phillip Platter       0.023640
31      38      chris.mallory@enron.com             Chris Mallory         0.021094
31      92      holden.salisbury@enron.com          Holden Salisbury      0.019483
31      175     lisa.gang@enron.com                                       0.017121
31      8       bill.williams@enron.com             Bill Williams III     0.016884
31      124     chris.stokley@enron.com             Chris Stokley         0.013516
31      36      alan.comnes@enron.com               Alan Comnes           0.012209
31      14      mark.guzman@enron.com               Mark Guzman           0.012191
31      110     heather.dunton@enron.com                                  0.011767
31      115     m..driscoll@enron.com               Michael M. Driscoll   0.011648
31      41      h..foster@enron.com                 Chris H. Foster       0.011597


CATEGORY 32

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 448                   COMPONENTS: 2
LARGEST COMPONENT SIZE: 443     PERCENT OF TOTAL GRAPH: 98.88%
GROUP DEGREE: 0.24667           GRAPH DENSITY: 0.01566
GROUP CLOSENESS: 0.02283        GROUP BETWEENNESS: 0.25615
AVERAGE p(z|u): 0.48            STDEV p(z|u): 0.40

MOST PROBABLE USERS

Topic#  ID#     Email Address                       Name                  p(z|u)
32      1490    steven.kean@enron.com                                     0.080563
32      817     richard.shapiro@enron.com           Richard Shapiro       0.045726
32      1463    maureen.mcvicker@enron.com                                0.017581
32      1547    mark.palmer@enron.com                                     0.016680
32      2209    sue.nord@enron.com                  Sue Nord              0.015046
32      5416    mark.schroeder@enron.com                                  0.014551
32      1779    lisa.yoho@enron.com                                       0.014144
32      5956    michael.terraso@enron.com                                 0.013448
32      28729   scott.bolton@enron.com                                    0.012918
32      818     linda.robertson@enron.com           Linda Robertson       0.012390
32      253     jeff.dasovich@enron.com             Jeff Dasovich         0.012386
32      2256    marchris.robinson@enron.com         Marchris Robinson     0.012175
32      2157    elizabeth.linnell@enron.com         Elizabeth Linnell     0.011677
32      2214    stephen.burns@enron.com             Stephen Burns         0.011380
32      1570    rob.bradley@enron.com                                     0.010403
32      2220    chris.long@enron.com                Chris Long            0.010320
32      2155    lara.leibman@enron.com              Lara Leibman          0.010191
32      28730   susan.landwehr@enron.com                                  0.010097
32      14954   mary.schoen@enron.com                                     0.010035
32      9314    jeffrey.keeler@enron.com                                  0.009731


CATEGORY 33

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 316                   COMPONENTS: 1
LARGEST COMPONENT SIZE: 316     PERCENT OF TOTAL GRAPH: 100%
GROUP DEGREE: 0.43672           GRAPH DENSITY: 0.06667
GROUP CLOSENESS: 0.21378        GROUP BETWEENNESS: 0.09600
AVERAGE p(z|u): 0.33            STDEV p(z|u): 0.36

MOST PROBABLE USERS

Topic#  ID#     Email Address                       Name                  p(z|u)
33      3100    alan.aronowitz@enron.com                                  0.034792
33      15296   jeffrey.hodge@enron.com                                   0.033737
33      18647   carol.clair@enron.com                                     0.024891
33      19980   harry.collins@enron.com                                   0.024811
33      288     tana.jones@enron.com                Tana Jones            0.020152
33      5897    mark.taylor@enron.com                                     0.019317
33      9094    stacy.dickson@enron.com                                   0.019237
33      3113    sara.shackleton@enron.com                                 0.018068
33      9244    richard.sanders@enron.com                                 0.016180
33      266     janette.elbertson@enron.com         Janette Elbertson     0.015597
33      1019    leslie.hansen@enron.com             Leslie Hansen         0.015299
33      19792   taffy.milligan@enron.com                                  0.015121
33      12004   suzanne.adams@enron.com             Suzanne Adams         0.014889
33      3103    robert.bruce@enron.com                                    0.013211
33      4135    mark.haedicke@enron.com                                   0.012553
33      20015   john.viverito@enron.com                                   0.011790
33      20033   kaye.ellis@enron.com                                      0.011554
33      17099   shari.stack@enron.com                                     0.011208
33      155     elizabeth.sager@enron.com           Elizabeth Sager       0.011204
33      19400   michael.robison@enron.com                                 0.011097


******************************************************************************************************************* 
CATEGORY  34 

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 256                    COMPONENTS: 2
LARGEST COMPONENT SIZE: 254      PERCENT OF TOTAL GRAPH: 99.22%
GROUP DEGREE: 0.50996            GRAPH DENSITY: 0.01569
GROUP CLOSENESS: 0.06689         GROUP BETWEENNESS: 0.43588
AVERAGE p(z|u): 0.29             STDEV p(z|u): 0.36


MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
34      293    louise.kitchen@enron.com          Louise Kitchen        0.198510
34      701    john.lavorato@enron.com           John Lavorato         0.126343
34      1452   david.delainey@enron.com          David Delainey        0.061198
34      37     tim.belden@enron.com              Tim Belden            0.036809
34      445    rob.milnthorp@enron.com           Rob Milnthorp         0.035902
34      1453   janet.dietrich@enron.com          Janet Dietrich        0.030647
34      34     f..calger@enron.com               Christopher F. Calge  0.026040
34      2823   kevin.presto@enron.com                                  0.024960
34      1477   greg.whalley@enron.com                                  0.020042
34      618    lloyd.will@enron.com              Lloyd Will            0.018848
34      222    david.oxley@enron.com             David Oxley           0.017276
34      168    christopher.calger@enron.com                            0.016651
34      275    e..haedicke@enron.com             Mark E. Haedicke      0.016644
34      743    tammie.schoppe@enron.com          Tammie Schoppe        0.015659
34      782    don.black@enron.com               Don Black             0.014802
34      2357   brian.redmond@enron.com           Brian Redmond         0.014781
34      495    wes.colwell@enron.com             Wes Colwell           0.014399
34      60     mike.swerzbin@enron.com           Mike Swerzbin         0.010634
34      3483   joseph.deffner@enron.com                                0.010007
34      3608   jean.mrha@enron.com                                     0.009001


***************************************************************************************************************** 
CATEGORY  35 


EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 535                    COMPONENTS: 1
LARGEST COMPONENT SIZE: 535      PERCENT OF TOTAL GRAPH: 100%
GROUP DEGREE: 0.47224            GRAPH DENSITY: 0.01311
GROUP CLOSENESS: 0.27967         GROUP BETWEENNESS: 0.46781
AVERAGE p(z|u): 0.36             STDEV p(z|u): 0.36


MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
35      227    sally.beck@enron.com              Sally Beck            0.184642
35      1771   shona.wilson@enron.com                                  0.027381
35      6399   beth.apollo@enron.com                                   0.023193
35      10227  brent.price@enron.com                                   0.021862
35      6526   mike.jordan@enron.com                                   0.018646
35      14781  patti.thompson@enron.com                                0.016208
35      334    leslie.reeves@enron.com           Leslie Reeves         0.016130
35      6396   fernley.dyson@enron.com           Fernley Dyson         0.015534
35      3029   sheila.glover@enron.com           Sheila Glover         0.014537
35      3002   m.hall@enron.com                  Bob M Hall            0.012954
35      4796   ted.murphy@enron.com                                    0.012286
35      1978   greg.piper@enron.com                                    0.010775
35      6398   chris.abel@enron.com                                    0.010679
35      3402   kristin.albrecht@enron.com                              0.010008
35      19242  mary.solmonson@enron.com                                0.009424
35      15922  bob.hall@enron.com                                      0.009192
35      3455   barry.pearce@enron.com                                  0.008548
35      8963   hector.mcloughlin@enron.com       Hector McLoughlin     0.008350
35      1292   cassandra.schultz@enron.com       Cassandra Schultz     0.008282
35      3032   sheri.thomas@enron.com            Sheri Thomas          0.008241
CATEGORY 36


EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 367                    COMPONENTS: 10
LARGEST COMPONENT SIZE: 357      PERCENT OF TOTAL GRAPH: 97.28%
GROUP DEGREE: 0.51432            GRAPH DENSITY: 0.02732
GROUP CLOSENESS: 0.00888         GROUP BETWEENNESS: 0.33721
AVERAGE p(z|u): 0.71             STDEV p(z|u): 0.38


MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
36      3113   sara.shackleton@enron.com                               0.123371
36      288    tana.jones@enron.com              Tana Jones            0.073892
36      280    marie.heard@enron.com             Marie Heard           0.023954
36      1102   tom.moran@enron.com               Tom Moran             0.018733
36      3101   susan.bailey@enron.com                                  0.018431
36      3098   stephanie.panus@enron.com         Stephanie Panus       0.017748
36      1101   tanya.rohauer@enron.com           Tanya Rohauer         0.016405
36      33     stephanie.sever@enron.com         Stephanie Sever       0.015137
36      551    lisa.lees@enron.com               Lisa Lees             0.013782
36      22643  pmims@enron.com                                         0.012239
36      18647  carol.clair@enron.com                                   0.011726
36      1142   karen.lambert@enron.com           Karen Lambert         0.011010
36      3404   samantha.boyd@enron.com                                 0.009015
36      1449   samuel.schott@enron.com           Samuel Schott         0.008093
36      20682  cheryl.nelson@enron.com           Cheryl Nelson         0.007243
36      1099   brant.reves@enron.com             Brant Reves           0.007189
36      8874   kelly.lombardi@enron.com          Kelly Lombardi        0.006891
36      1181   exchange.administrator@enron.com                        0.006090
36      3029   sheila.glover@enron.com           Sheila Glover         0.006041
36      5583   frank.davis@enron.com                                   0.005975


CATEGORY  37 


EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 625                    COMPONENTS: 2
LARGEST COMPONENT SIZE: 623      PERCENT OF TOTAL GRAPH: 99.68%
GROUP DEGREE: 0.42207            GRAPH DENSITY: 0.01923
GROUP CLOSENESS: 0.06416         GROUP BETWEENNESS: 0.32801
AVERAGE p(z|u): 0.77             STDEV p(z|u): 0.35


MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
37      253    jeff.dasovich@enron.com           Jeff Dasovich         0.076816
37      817    richard.shapiro@enron.com         Richard Shapiro       0.049471
37      1489   james.steffes@enron.com                                 0.038013
37      801    susan.mara@enron.com              Susan Mara            0.035751
37      181    paul.kaufman@enron.com                                  0.029256
37      347    d..steffes@enron.com              James D. Steffes      0.029135
37      2222   harry.kingerski@enron.com         Harry Kingerski       0.021314
37      1490   steven.kean@enron.com                                   0.018507
37      36     alan.comnes@enron.com             Alan Comnes           0.018317
37      813    sarah.novosel@enron.com                                 0.017805
37      17095  mary.hain@enron.com                                     0.015869
37      818    linda.robertson@enron.com         Linda Robertson       0.015279
37      8546   sandra.mccubbin@enron.com         Sandra McCubbin       0.012470
37      1474   joe.hartsoe@enron.com                                   0.012146
37      1479   leslie.lawner@enron.com                                 0.012061
37      1180   karen.denne@enron.com             Karen Denne           0.011729
37      28654  mona.petrochko@enron.com                                0.008652
37      812    l..nicolay@enron.com                                    0.008379
37      800    ray.alvarez@enron.com             Ray Alvarez           0.008375
37      2326   janel.guerrero@enron.com                                0.008131


******************************************************************************************************************* 
CATEGORY  38 


EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 182                    COMPONENTS: 13
LARGEST COMPONENT SIZE: 141      PERCENT OF TOTAL GRAPH: 77.47%
GROUP DEGREE: 0.38956            GRAPH DENSITY: 0.01105
GROUP CLOSENESS: 0.00442         GROUP BETWEENNESS: 0.40365
AVERAGE p(z|u): 0.70             STDEV p(z|u): 0.42


MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
38      4110   klay@enron.com                                          0.114435
38      3441   kenneth.lay@enron.com                                   0.088106
38      3536   rosalee.fleming@enron.com         Rosalee Fleming       0.051847
38      4664   sherri.sera@enron.com                                   0.041385
38      4058   jeff.skilling@enron.com                                 0.019147
38      4104   joannie.williamson@enron.com                            0.016989
38      14     mark.guzman@enron.com             Mark Guzman           0.016475
38      14935  susan.scott@enron.com                                   0.009385
38      4116   katherine.brown@enron.com                               0.006860
38      80     john.zufferli@enron.com                                 0.005168
38      11168  tobin.carlson@enron.com                                 0.004727
38      12     craig.dean@enron.com              Craig Dean            0.004022
38      6948   gary.stadler@enron.com            Gary Stadler          0.004008
38      2288   misha.siegel@enron.com            Misha Siegel          0.003858
38      5485   tori.wells@enron.com                                    0.003184
38      4063   sherri.reinartz@enron.com                               0.002397
38      24446  executive.office@enron.com        Office of the Chief   0.002256
38      5026   sally.keepers@enron.com                                 0.002184
38      19276  nicholas.stephan@enron.com                              0.002027
38      5645   wilson.kriegel@enron.com                                0.001937


**************************************************************************************************************** 
CATEGORY  39 


EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 293                    COMPONENTS: 2
LARGEST COMPONENT SIZE: 291      PERCENT OF TOTAL GRAPH: 99.32%
GROUP DEGREE: 0.32581            GRAPH DENSITY: 0.01370
GROUP CLOSENESS: 0.06182         GROUP BETWEENNESS: 0.35442
AVERAGE p(z|u): 0.3              STDEV p(z|u): 0.36


MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
39      1544   rick.buy@enron.com                                      0.079570
39      2287   mike.mcconnell@enron.com          Mike Mcconnell        0.071530
39      2359   andy.zipper@enron.com             Andy Zipper           0.047493
39      4134   jeffrey.shankman@enron.com                              0.038991
39      1477   greg.whalley@enron.com                                  0.037847
39      4132   john.sherriff@enron.com                                 0.028263
39      1978   greg.piper@enron.com                                    0.027670
39      788    david.port@enron.com              David Port            0.025199
39      4796   ted.murphy@enron.com                                    0.024818
39      2202   a..shankman@enron.com             Jeffrey A. Shankman   0.022218
39      365    jay.webb@enron.com                Jay Webb              0.017257
39      1625   david.gorte@enron.com                                   0.015062
39      2370   vladimir.gorny@enron.com          Vladimir Gorny        0.011995
39      4133   philippe.bibi@enron.com                                 0.010761
39      481    s..bradford@enron.com             William S. Bradford   0.010596
39      2383   michael.brown@enron.com           Michael Brown         0.010569
39      5062   cathy.phillips@enron.com                                0.010036
39      1200   savita.puthigai@enron.com         Savita Puthigai       0.009958
39      1206   brad.richter@enron.com                                  0.009650
39      293    louise.kitchen@enron.com          Louise Kitchen        0.009375
CATEGORY 40


EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 31                     COMPONENTS: 1
LARGEST COMPONENT SIZE: 31       PERCENT OF TOTAL GRAPH: 100%
GROUP DEGREE: 0.49540            GRAPH DENSITY: 0.10000
GROUP CLOSENESS: 0.16908         GROUP BETWEENNESS: 0.56133
AVERAGE p(z|u): 0.56             STDEV p(z|u): 0.38


MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
40      256    pete.davis@enron.com              Pete Davis            0.177489
40      14     mark.guzman@enron.com             Mark Guzman           0.088616
40      19     ryan.slinger@enron.com            Ryan Slinger          0.088490
40      20     geir.solberg@enron.com            Geir Solberg          0.088011
40      12     craig.dean@enron.com              Craig Dean            0.082880
40      8      bill.williams@enron.com           Bill Williams III     0.056518
40      15     leaf.harasin@enron.com            Leaf Harasin          0.049016
40      17     bert.meyers@enron.com             Bert Meyers           0.048908
40      79     eric.linder@enron.com             Eric Linder           0.043338
40      108    albert.meyers@enron.com                                 0.039818
40      219    michael.mier@enron.com            Michael Mier          0.039728
40      152    john.anderson@enron.com           John Anderson         0.039724
40      11     monika.causholli@enron.com        Monika Causholli      0.039187
40      24     bill.williams.iii@enron.com       bill.williams.iii     0.030895
40      28279  dporter3@enron.com                                      0.030679
40      92     holden.salisbury@enron.com        Holden Salisbury      0.015729
40      89     greg.wolfe@enron.com              Greg Wolfe            0.013466
40      16     steven.merris@enron.com           Steven Merris         0.008022
40      28280  jbryson@enron.com                                       0.004246
40      95     center.dl-portland@enron.com      DL-Portland World Tr  0.002170


CATEGORY  41 


EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 870                    COMPONENTS: 15
LARGEST COMPONENT SIZE: 803      PERCENT OF TOTAL GRAPH: 92.30%
GROUP DEGREE: 0.18464            GRAPH DENSITY: 0.00230
GROUP CLOSENESS: 0.00124         GROUP BETWEENNESS: 0.27743
AVERAGE p(z|u): 0.35             STDEV p(z|u): 0.39


MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
41      756    geoff.storey@enron.com            Geoff Storey          0.021761
41      747    s..shively@enron.com              Hunter S. Shively     0.021107
41      735    ina.rangel@enron.com              Ina Rangel            0.021059
41      10240  hunter.shively@enron.com                                0.020921
41      1488   perfmgmt@enron.com                "Performance Evaluat  0.020366
41      1559   john.arnold@enron.com                                   0.013471
41      1418   lexi.elliott@enron.com            Lexi Elliott          0.013286
41      19787  v.weldon@enron.com                                      0.012157
41      655    karen.buckley@enron.com           Karen Buckley         0.011389
41      8357   airam.arteaga@enron.com                                 0.011050
41      23910  mary.fischer@enron.com            Mary Fischer          0.010572
41      1419   billy.lemmons@enron.com           Billy Lemmons Jr.     0.010545
41      795    john.griffith@enron.com           John Griffith         0.010305
41      2780   kim.melodick@enron.com                                  0.009985
41      710    larry.may@enron.com               Larry May             0.009911
41      4661   charlene.jackson@enron.com                              0.007626
41      750    jeanie.slone@enron.com            Jeanie Slone          0.007531
41      1543   richard.causey@enron.com                                0.006660
41      4662   celeste.roberts@enron.com                               0.006151
41      798    announcements.enron@enron.com     Enron General Announ  0.005602


*******************************************************************************************************************
CATEGORY 42


EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 423                    COMPONENTS: 7
LARGEST COMPONENT SIZE: 382      PERCENT OF TOTAL GRAPH: 90.31%
GROUP DEGREE: 0.26931            GRAPH DENSITY: 0.00711
GROUP CLOSENESS: 0.00268         GROUP BETWEENNESS: 0.35488
AVERAGE p(z|u): 0.62             STDEV p(z|u): 0.42


MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
42      642    eric.bass@enron.com               Eric Bass             0.042792
42      703    matthew.lenhart@enron.com         Matthew Lenhart       0.035787
42      707    mike.maggi@enron.com              Mike Maggi            0.023799
42      6992   judy.hernandez@enron.com          Judy Hernandez        0.018926
42      8773   michelle.nelson@enron.com         Michelle Nelson       0.014069
42      1115   clint.dean@enron.com              Clint Dean            0.013477
42      795    john.griffith@enron.com           John Griffith         0.010839
42      1878   bryan.hull@enron.com              Bryan Hull            0.010542
42      706    m..love@enron.com                 Phillip M. Love       0.010374
42      678    c..giron@enron.com                Darron C. Giron       0.010239
42      15272  phillip.love@enron.com            Phillip Love          0.009753
42      14765  darron.giron@enron.com                                  0.009049
42      8917   timothy.blanchard@enron.com                             0.008525
42      453    cooper.richey@enron.com           Cooper Richey         0.008415
42      15321  shanna.husser@enron.com           Shanna Husser         0.008368
42      6702   chad.landry@enron.com             Chad Landry           0.006848
42      6580   angela.barnett@enron.com          Angela Barnett        0.005651
42      19886  leslie.smith@enron.com                                  0.005561
42      742    amanda.rybarski@enron.com         Amanda Rybarski       0.005443
42      18727  regina.blackshear@enron.com       Regina Blackshear     0.005079


****************************************************************************************************************** 
CATEGORY  43 


EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 1256                   COMPONENTS: 3
LARGEST COMPONENT SIZE: 1246     PERCENT OF TOTAL GRAPH: 99.20%
GROUP DEGREE: 0.28213            GRAPH DENSITY: 0.00637
GROUP CLOSENESS: 0.00884         GROUP BETWEENNESS: 0.26829
AVERAGE p(z|u): 0.50             STDEV p(z|u): 0.39


MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
43      15554  daren.farmer@enron.com                                  0.045088
43      698    kam.keiser@enron.com              Kam Keiser            0.021511
43      713    errol.mclaughlin@enron.com        Errol McLaughlin Jr.  0.017938
43      15921  pat.clynes@enron.com                                    0.012968
43      706    m..love@enron.com                 Phillip M. Love       0.012744
43      558    melba.lozano@enron.com            Melba Lozano          0.010818
43      14765  darron.giron@enron.com                                  0.010146
43      644    david.baumbach@enron.com          David Baumbach        0.009285
43      565    kevin.meredith@enron.com          Kevin Meredith        0.009056
43      11079  melissa.graves@enron.com          Melissa Graves        0.008244
43      612    chris.walker@enron.com            Chris Walker          0.007684
43      15272  phillip.love@enron.com            Phillip Love          0.007458
43      1777   rita.wynne@enron.com                                    0.007307
43      1695   torrey.moorer@enron.com                                 0.006879
43      15318  robert.cass@enron.com                                   0.006732
43      718    bruce.mills@enron.com             Bruce Mills           0.006592
43      14684  robert.cotten@enron.com           Robert Cotten         0.006567
43      11108  julie.meyers@enron.com            Julie Meyers          0.006533
43      604    tara.sweitzer@enron.com           Tara Sweitzer         0.006436
43      679    c..gossett@enron.com              Jeffrey C. Gossett    0.005868
CATEGORY 44


EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 420                    COMPONENTS: 3
LARGEST COMPONENT SIZE: 414      PERCENT OF TOTAL GRAPH: 98.57%
GROUP DEGREE: 0.49411            GRAPH DENSITY: 0.01909
GROUP CLOSENESS: 0.01826         GROUP BETWEENNESS: 0.43733
AVERAGE p(z|u): 0.49             STDEV p(z|u): 0.42


MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
44      5335   vince.kaminski@enron.com                                0.208832
44      1654   j.kaminski@enron.com                                    0.052962
44      2058   shirley.crenshaw@enron.com        Shirley Crenshaw      0.034859
44      29453  vkamins@enron.com                                       0.028588
44      4807   stinson.gibner@enron.com                                0.024029
44      226    don.baughman@enron.com            Don Baughman Jr       0.019995
44      2106   vasant.shanbhogue@enron.com       Vasant Shanbhogue     0.015230
44      2109   zimin.lu@enron.com                Zimin Lu              0.010503
44      4310   ebass@enron.com                                         0.010253
44      30143  vince.j.kaminski@enron.com        Vince J" "Kaminski    0.009942
44      2102   tanya.tamarchenko@enron.com       Tanya Tamarchenko     0.009657
44      34375  alewis@enron.com                                        0.008964
44      1666   pinnamaneni.krishnarao@enron.com                        0.008216
44      60189  pkeavey@ect.enron.com                                   0.008159
44      488    mike.carson@enron.com             Mike Carson           0.007894
44      1388   christie.patrick@enron.com        Christie Patrick      0.007204
44      6033   mike.roberts@enron.com                                  0.007103
44      20277  grant.masson@enron.com                                  0.006883
44      6417   kaminski@enron.com                                      0.004629
44      1748   dale.surbey@enron.com                                   0.004280


CATEGORY  45 


EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 363                    COMPONENTS: 4
LARGEST COMPONENT SIZE: 351      PERCENT OF TOTAL GRAPH: 96.69%
GROUP DEGREE: 0.37839            GRAPH DENSITY: 0.01105
GROUP CLOSENESS: 0.00827         GROUP BETWEENNESS: 0.53580
AVERAGE p(z|u): 0.75             STDEV p(z|u): 0.37


MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
45      1093   kay.mann@enron.com                Kay Mann              0.130320
45      9244   richard.sanders@enron.com                               0.036335
45      1174   b..sanders@enron.com              Richard B. Sanders    0.020703
45      1651   ben.jacoby@enron.com                                    0.020693
45      29416  richard.b.sanders@enron.com                             0.015770
45      154    sheila.tweed@enron.com            Sheila Tweed          0.014446
45      17542  roseann.engeldorf@enron.com                             0.011599
45      4638   james.derrick@enron.com                                 0.011240
45      20027  kathleen.carnahan@enron.com                             0.010589
45      17261  carlos.sole@enron.com                                   0.010262
45      4940   rob.walls@enron.com                                     0.008472
45      1568   lisa.bills@enron.com                                    0.008230
45      17208  chris.booth@enron.com                                   0.007672
45      6899   fred.mitro@enron.com              Fred Mitro            0.007671
45      17434  britt.davis@enron.com                                   0.007496
45      1460   c..williams@enron.com             Robert C. Williams    0.006835
45      14935  susan.scott@enron.com                                   0.006371
45      2279   andrew.edison@enron.com           Andrew Edison         0.006344
45      24438  john.schwartzenburg@enron.com                           0.006127
45      2386   heather.kroll@enron.com           Heather Kroll         0.006031


******************************************************************************************************************* 
CATEGORY  46 


EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 978                    COMPONENTS: 1
LARGEST COMPONENT SIZE: 978      PERCENT OF TOTAL GRAPH: 100%
GROUP DEGREE: 0.61492            GRAPH DENSITY: 0.00716
GROUP CLOSENESS: 0.266322        GROUP BETWEENNESS: 0.23861
AVERAGE p(z|u): 0.35             STDEV p(z|u): 0.38


MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
46      3457   gary.hickerson@enron.com                                0.025693
46      1657   jeff.kinneman@enron.com                                 0.019869
46      543    robert.johnston@enron.com         Robert Johnston       0.017357
46      3659   scott.tholan@enron.com                                  0.014595
46      279    frank.hayden@enron.com            Frank Hayden          0.013467
46      14653  stephen.stock@enron.com                                 0.012490
46      366    zhiyong.wei@enron.com             Zhiyong Wei           0.011913
46      293    louise.kitchen@enron.com          Louise Kitchen        0.010144
46      2235   beth.perlman@enron.com            Beth Perlman          0.009677
46      3462   eric.gonzales@enron.com                                 0.009453
46      756    geoff.storey@enron.com            Geoff Storey          0.008877
46      1737   per.sekse@enron.com                                     0.008811
46      1627   john.greene@enron.com                                   0.008799
46      1571   michael.bradley@enron.com                               0.008786
46      3475   bryan.seyfried@enron.com                                0.008119
46      2202   a..shankman@enron.com             Jeffrey A. Shankman   0.008106
46      23304  michelle.cisneros@enron.com                             0.007940
46      1999   jaime.gualy@enron.com             Jaime Gualy           0.007823
46      3535   markus.fiala@enron.com            Markus Fiala          0.007595
46      1711   paul.pizzolato@enron.com                                0.007570


***************************************************************************************************************** 
CATEGORY  47 


EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 237                    COMPONENTS: 1
LARGEST COMPONENT SIZE: 237      PERCENT OF TOTAL GRAPH: 100%
GROUP DEGREE: 0.29692            GRAPH DENSITY: 0.02119
GROUP CLOSENESS: 0.16754         GROUP BETWEENNESS: 0.29250
AVERAGE p(z|u): 0.32             STDEV p(z|u): 0.38


MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
47      155    elizabeth.sager@enron.com         Elizabeth Sager       0.139767
47      4135   mark.haedicke@enron.com                                 0.066912
47      206    christian.yoder@enron.com                               0.038500
47      124    chris.stokley@enron.com           Chris Stokley         0.036201
47      14696  lisa.mellencamp@enron.com         Lisa Mellencamp       0.029949
47      481    s..bradford@enron.com             William S. Bradford   0.026618
47      590    edward.sacks@enron.com            Edward Sacks          0.025208
47      144    tracy.ngo@enron.com                                     0.022714
47      4854   william.bradford@enron.com                              0.022092
47      9244   richard.sanders@enron.com                               0.020364
47      329    david.portz@enron.com             David Portz           0.019284
47      318    marcus.nettelton@enron.com        Marcus Nettelton      0.019239
47      2318   vicki.sharp@enron.com             Vicki Sharp           0.019177
47      1786   michael.tribolet@enron.com                              0.018787
47      347    d..steffes@enron.com              James D. Steffes      0.017370
47      58     p..o'neil@enron.com               Murray P. Neil        0.016995
47      15296  jeffrey.hodge@enron.com                                 0.016055
47      18647  carol.clair@enron.com                                   0.013327
47      1090   harlan.murphy@enron.com           Harlan Murphy         0.010520
47             genia.fitzgerald@enron.com        Genia Fitzgerald      0.009790

B.4  Author  Topic  with  all  Words  (No  Dictionary)


CATEGORY  0 


EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 2635                   COMPONENTS: 9
LARGEST COMPONENT SIZE: 2606     PERCENT OF TOTAL GRAPH: 98.90%
GROUP DEGREE: 0.17486            GRAPH DENSITY: 0.00114
GROUP CLOSENESS: 0.00173         GROUP BETWEENNESS: 0.28903
AVERAGE p(z|u): 0.13             STDEV p(z|u): 0.25


MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
0       256    pete.davis@enron.com              Pete Davis            0.000224689115565
0       5335   vince.kaminski@enron.com                                0.000224629534140
0       2222   harry.kingerski@enron.com         Harry Kingerski       0.000224151636700
0       3132   james.wright@enron.com                                  0.000224056885215
0       36     alan.comnes@enron.com             Alan Comnes           0.000223919554515
0       1651   ben.jacoby@enron.com                                    0.000223898827630
0       3100   alan.aronowitz@enron.com                                0.000223896134608
0       8308   steven.harris@enron.com           Steven Harris         0.000223838281534
0       37     tim.belden@enron.com              Tim Belden            0.000223834623853
0       1688   ed.mcmichael@enron.com                                  0.000223694291968
0       813    sarah.novosel@enron.com                                 0.000223666544479
0       403    david.forster@enron.com                                 0.000223605891197
0       818    linda.robertson@enron.com         Linda Robertson       0.000223533527968
0       168    christopher.calger@enron.com                            0.000223458217152
0       7158   mark.greenberg@enron.com          Mark Greenberg        0.000223410680637
0       1474   joe.hartsoe@enron.com                                   0.000223400873207
0       812    l..nicolay@enron.com                                    0.000223341104209
0       401    bob.shults@enron.com              Bob Shults            0.000223222289714
0       2810   larry.campbell@enron.com                                0.000223194609193
0       154    sheila.tweed@enron.com            Sheila Tweed          0.000223119660127


CATEGORY  1 

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 2709                   COMPONENTS: 15
LARGEST COMPONENT SIZE: 2654     PERCENT OF TOTAL GRAPH: 97.97%
GROUP DEGREE: 0.18232            GRAPH DENSITY: 0.00258
GROUP CLOSENESS: 0.00075         GROUP BETWEENNESS: 0.17932
AVERAGE p(z|u): 0.23             STDEV p(z|u): 0.04


MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
1       19980  harry.collins@enron.com                                 0.001437257393948
1       18664  brian.lindsay@enron.com           Brian Lindsay         0.001352654135122
1       37910  amy.heffernan@enron.com                                 0.001314934606771
1       6265   jana.morse@enron.com              Jana Morse            0.001290984787760
1       2995   juana.fayett@enron.com            Juana Fayett          0.001269252153405
1       18180  sonya.clarke@enron.com                                  0.001255208014767
1       17783  paul.maley@enron.com              Paul Maley            0.001202681298278
1       1094   karen.o'day@enron.com             Karen day             0.001197873773149
1       18181  tim.davies@enron.com                                    0.001187950996886
1       81490  ngo@enron.com                                           0.001162785669110
1       29291  .cooper@enron.com                 ebs                   0.001161458455270
1       250    cynthia.clark@enron.com           Cynthia Clark         0.001157936702115
1       2186   robbi.rossi@enron.com             Robbi Rossi           0.001142328065856
1       8876   tandra.coleman@enron.com          Tandra Coleman        0.001137397289212
1       19629  albert.escamilla@enron.com                              0.001124752596785
1       18031  lesli.campbell@enron.com          Lesli Campbell        0.001110176158394
1       2993   trang.le@enron.com                Trang Le              0.001090336692365
1       19627  julie.brewer@enron.com                                  0.001087762015664
1       1202   center.eol@enron.com                                    0.001084833158682
1       17577  alison.keogh@enron.com            Alison Keogh          0.001066101166139


CATEGORY  2 


EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 2613                   COMPONENTS: 4
LARGEST COMPONENT SIZE: 2606     PERCENT OF TOTAL GRAPH: 99.73%
GROUP DEGREE: 0.16524            GRAPH DENSITY: 0.00230
GROUP CLOSENESS: 0.01478         GROUP BETWEENNESS: 0.17925
AVERAGE p(z|u): 0.02             STDEV p(z|u): 0.03


MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
2       6815   debra.perlingiere@enron.com       Debra Perlingiere     0.001219539848980
2       56883  robin.decker@enron.com                                  0.001117162899362
2       56873  elizabeth.serralheiro@enron.com                         0.000851194436798
2       29433  legal.4@enron.com                                       0.000796814254506
2       81683  james.canney@enron.com            James Canney          0.000726944434387
2       15231  joanne.rozycki@enron.com          Joanne Rozycki        0.000723907351742
2       20033  kaye.ellis@enron.com                                    0.000695117396012
2       7476   andrew.ralston@enron.com          Andrew Ralston        0.000685078306837
2       19802  sheri.cromwell@enron.com                                0.000665121467956
2       56908  lina.jimenez@enron.com                                  0.000660426109974
2       12226  carolyn.george@enron.com                                0.000624006796842
2       29409  ned.crady@enron.com                                     0.000595200797749
2       15265  paul.burgener@enron.com                                 0.000572896648689
2       15246  margo.terrell@enron.com                                 0.000537901138048
2       14777  kay.young@enron.com                                     0.000529420421996
2       42425  allison.mchenry@enron.com                               0.000519970465448
2       41717  brenda.l.funk@enron.com           Brenda L. "Funk       0.000508707286008
2       17585  j..simmons@enron.com                                    0.000504685196113
2       19054  jorge.garcia@enron.com                                  0.000494286468186
2       1093   kay.mann@enron.com                Kay Mann              0.000494261619411


***************************************************************************************************************** 
CATEGORY  3 


EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 4698                   COMPONENTS: 6
LARGEST COMPONENT SIZE: 4685     PERCENT OF TOTAL GRAPH: 99.72%
GROUP DEGREE: 0.07049            GRAPH DENSITY: 0.00170
GROUP CLOSENESS: 0.00510         GROUP BETWEENNESS: 0.06960
AVERAGE p(z|u): 0.02             STDEV p(z|u): 0.04


MOST PROBABLE USERS

Topic#  ID#    Email Address                     Name                  p(z|u)
3       2363   morris.larubbio@enron.com         Morris Larubbio       0.001020780948689
3       718    bruce.mills@enron.com             Bruce Mills           0.000980615289663
3       594    amanda.schultz@enron.com          Amanda Schultz        0.000949058445429
3       16342  amy.ochoa@enron.com               Amy Ochoa             0.000948042922951
3       6752   chance.rabon@enron.com            Chance Rabon          0.000903863120773
3       15536  jay.smith@enron.com               Jay Smith             0.000872105001813
3       4320   erwin.landivar@enron.com                                0.000850443826055
3       6579   andres.balmaceda@enron.com        Andres Balmaceda      0.000839203276462
3       23522  john.o'conner@enron.com           John Conner           0.000822004114002
3       237    charles.brewer@enron.com          Charles Brewer        0.000803686250356
3       353    mark.symms@enron.com              Mark Symms            0.000800571151252
3       6683   carole.frank@enron.com            Carole Frank          0.000792833595108
3       655    karen.buckley@enron.com           Karen Buckley         0.000778043460693
3       2501   kimberly.bates@enron.com          Kimberly Bates        0.000750125656713
3       64095  devries@enron.com                                       0.000740211436400
3       14765  darron.giron@enron.com                                  0.000733296006186
3       16333  enw-employees@enron.com                                 0.000729033912227
3       63997  lagrasta@enron.com                                      0.000716721930510
3       20686  rosalinda.zermeno@enron.com       Rosalinda Zermeno     0.000693197947004
3       7163   holly.heath@enron.com             Holly Heath           0.000688478749333


CATEGORY 4

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 3157                  COMPONENTS: 9
LARGEST COMPONENT SIZE: 3136    PERCENT OF TOTAL GRAPH: 99.33%
GROUP DEGREE: 0.09114           GRAPH DENSITY: 0.00127
GROUP CLOSENESS: 0.00289        GROUP BETWEENNESS: 0.19926
AVERAGE p(z|u): 0.02            STDEV p(z|u): 0.02


MOST PROBABLE USERS

Topic#  ID#    Email Address              Name                 p(z|u)
4       2505   mara.bronstein@enron.com   Mara Bronstein       0.001187997088076
4       20150  philip.polsky@enron.com    -                    0.000818567856795
4       773    .ward@enron.com            houston              0.000751953743619
4       14660  theresa.staab@enron.com    -                    0.000725819772445
4       18754  valerie.vela@enron.com     Valerie Vela         0.000654024597874
4       465    dipak.agarwalla@enron.com  Dipak Agarwalla      0.000612009047108
4       3605   t..lucci@enron.com         Paul T. Lucci        0.000536658305317
4       52900  'williams@enron.com        -                    0.000527977577506
4       8819   d..mcilvoy@enron.com       Karen D. McIlvoy     0.000515260857618
4       715    shelly.mendel@enron.com    Shelly Mendel        0.000509582900182
4       6635   arvel.martin@enron.com     Arvel Martin         0.000477629632255
4       625    p..adams@enron.com         Jacqueline P. Adams  0.000464865109127
4       40426  s..ward@enron.com          Kim S. Ward          0.000460023685377
4       32548  john.kiani@enron.com       -                    0.000427630112348
4       1769   mark.whitt@enron.com       -                    0.000401837665184
4       19098  amy.felling@enron.com      Amy Felling          0.000396195180604
4       719    l..mims@enron.com          Patrice L. Mims      0.000386317255399
4       774    charles.weldon@enron.com   V. Charles Weldon    0.000373450799570
4       338    bryce.schneider@enron.com  Bryce Schneider      0.000373107573232
4       43902  chike.okpara@enron.com     Chike Okpara         0.000362428886254


CATEGORY 5

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 2385                  COMPONENTS: 13
LARGEST COMPONENT SIZE: 2343    PERCENT OF TOTAL GRAPH: 98.24%
GROUP DEGREE: 0.20805           GRAPH DENSITY: 0.00126
GROUP CLOSENESS: 0.00107        GROUP BETWEENNESS: 0.37902
AVERAGE p(z|u): 0.02            STDEV p(z|u): 0.02

MOST PROBABLE USERS

Topic#  ID#    Email Address                       Name                  p(z|u)
5       1175   arsystem@mailman.enron.com          ARSystem              0.001574130836517
5       41760  matt.dawson@enron.com               -                     0.001341456308938
5       311    hal.mckinney@enron.com              Hal McKinney          0.001330269634127
5       20117  arsystem@ect.enron.com              -                     0.001228208602414
5       360    wayne.vinson@enron.com              Donald Wayne Vinson   0.001210361972996
5       14566  approval.eol.gas.traders@enron.com  -                     0.001159613137693
5       15555  neal.d.winfree@enron.com            -                     0.000952146810863
5       14602  kay.classen@enron.com               Kay Classen           0.000916469529903
5       39760  fletcher.j.sturm@enron.com          "fletcher.j.sturm@en  0.000907536017550
5       9320   information.management@enron.com    -                     0.000636907455690
5       131    maria.van@enron.com                 Maria Van houten      0.000610300955835
5       22380  m..hall@enron.com                   Bob M. Hall           0.000516527037409
5       41075  daemon.extra@enron.com              EXTRA Mailer Daemon   0.000496930219667
5       17260  shemeika.landry@enron.com           -                     0.000436118753118
5       32858  arsystem@enron.com                  -                     0.000419123791477
5       694    brad.jones@enron.com                Brad Jones            0.000359644926769
5       30857  erequest@enron.com                  -                     0.000358074536540
5       6784   daryll.fuentes@enron.com            Daryll Fuentes        0.000339869838226
5       1332   michael.kass@enron.com              Michael Kass          0.000312726975176
5       14798  sonya.johnson@enron.com             Sonya Johnson         0.000278234712176


CATEGORY 6

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 2856                  COMPONENTS: 9
LARGEST COMPONENT SIZE: 2819    PERCENT OF TOTAL GRAPH: 98.70%
GROUP DEGREE: 0.05589           GRAPH DENSITY: 0.00140
GROUP CLOSENESS: 0.00135        GROUP BETWEENNESS: 0.12907
AVERAGE p(z|u): 0.02            STDEV p(z|u): 0.02


MOST PROBABLE USERS

Topic#  ID#    Email Address                   Name                  p(z|u)
6       427    chris.dorland@enron.com         Chris Dorland         0.000829621769665
6       453    cooper.richey@enron.com         Cooper Richey         0.000769113638628
6       38243  jr.martinez@enron.com           -                     0.000748348974864
6       13416  gerri.gosnell@enron.com         Gerri Gosnell         0.000747754316165
6       712    jonathan.mckay@enron.com        Jonathan Mckay        0.000744392012942
6       35820  greg.frers@enron.com            Greg Frers            0.000741072769627
6       80     john.zufferli@enron.com         -                     0.000726454459572
6       6765   carlos.torres@enron.com         Carlos Torres         0.000691880065337
6       462    ryan.watt@enron.com             Ryan Watt             0.000612055728614
6       21948  michelle.wells@enron.com        Michelle Wells        0.000577940195825
6       37969  molnar.mark@enron.com           -                     0.000561595583295
6       1189   dl-ga-canada_calgary@enron.com  DL-GA-Canada_Calgary  0.000556375086526
6       734    dutch.quigley@enron.com         Dutch Quigley         0.000526694264641
6       795    john.griffith@enron.com         John Griffith         0.000524559329619
6       3617   chris.cramer@enron.com          Chris Cramer          0.000523267256669
6       444    angela.mcculloch@enron.com      Angela McCulloch      0.000520137853778
6       35828  greg.mann@enron.com             Greg Mann             0.000511308688617
6       38598  f.wong@enron.com                Michael F Wong        0.000478494144063
6       63973  milnthorp@enron.com             -                     0.000478287851721
6       2563   brad.mckay@enron.com            Brad Mckay            0.000444208485577

*****************************************************************************************************************
CATEGORY 7

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 6623                  COMPONENTS: 12
LARGEST COMPONENT SIZE: 6572    PERCENT OF TOTAL GRAPH: 99.23%
GROUP DEGREE: 0.08334           GRAPH DENSITY: 0.00106
GROUP CLOSENESS: 0.00060        GROUP BETWEENNESS: 0.09969
AVERAGE p(z|u): 0.02            STDEV p(z|u): 0.02

MOST PROBABLE USERS

Topic#  ID#    Email Address                   Name                  p(z|u)
7       1017   settlements.ees@enron.com       EES Power Settlement  0.001409769839186
7       84535  rring@ees.enron.com             NYISO TIE List        0.001169214415918
7       6998   jeffrey.jackson@enron.com       Jeffrey Jackson       0.000963320018556
7       59116  weekly.report@enron.com         -                     0.000826629825354
7       21830  paul.rodger@enron.com           Paul Rodger           0.000781719530528
7       11307  jebong.lee@enron.com            -                     0.000724479013296
7       2352   fabian.taylor@enron.com         Fabian Taylor         0.000711825762056
7       438    sean.lalani@enron.com           Sean Lalani           0.000639518998844
7       443    mike.macphee@enron.com          Mike Macphee          0.000591816044036
7       35738  penny.mccarran@enron.com        Penny McCarran        0.000565921355502
7       21829  paul.dunsmore@enron.com         Paul Dunsmore         0.000556405798017
7       844    robert'.'harshbarger@enron.com  Robert Harshbarger    0.000549804819768
7       2380   tom.dutta@enron.com             Tom Dutta             0.000540184041164
7       58211  massage.therapy@enron.com       -                     0.000533279294119
7       49422  scott.kauffman@enron.com        -                     0.000524067551177
7       53751  vicki.berg@enron.com            Vicki Berg            0.000512397293925
7       26919  janet.bowers@enron.com          Janet Bowers          0.000478241192143
7       194    julie.sarnowski@enron.com       -                     0.000472234530220
7       8792   sitara@enron.com                Sitara                0.000407644180612
7       324    juan.padron@enron.com           Juan Padron           0.000407453318653


CATEGORY 8

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 237                   COMPONENTS: 1
LARGEST COMPONENT SIZE: 237     PERCENT OF TOTAL GRAPH: 100%
GROUP DEGREE: 0.11467           GRAPH DENSITY: 0.00083
GROUP CLOSENESS: 0.00053        GROUP BETWEENNESS: 0.09985
AVERAGE p(z|u): 0.02            STDEV p(z|u): 0.03


MOST PROBABLE USERS

Topic#  ID#    Email Address                         Name                  p(z|u)
8       1894   chairman.enron@enron.com              Enron Office Of The   0.000927562299313
8       2368   coo.jeff@enron.com                    Jeff McMahon - Presi  0.000914154964422
8       1914   chairman.ken@enron.com                Ken Lay - Office of   0.000909037256528
8       54525  dl-ga-all_egs@enron.com               DL-GA-all_egs         0.000712308411108
8       44086  enron.operations@enron.com            -                     0.000678703259497
8       65258  nate.ellis@enron.com                  -                     0.000674942026342
8       58495  barbara.taylor@enron.com              -                     0.000647457698254
8       1895   dl-ga-all_enron_worldwide2@enron.com  DL-GA-all_enron_worl  0.000619311533344
8       24446  executive.office@enron.com            Office of the Chief   0.000597890616624
8       4762   office.chairman@enron.com             -                     0.000593806777944
8       4884   enron.worldwide@enron.com             -                     0.000591586827695
8       630    jason.althaus@enron.com               Jason Althaus         0.000584571775075
8       21296  money.in.motion@mailman.enron.com     -                     0.000583867882416
8       58487  mhaedic@ect.enron.com                 "Mark Haedicke J.D."  0.000580002758654
8       41153  esa_employees@enron.com               -                     0.000577065975487
8       3609   milagros.velasquez@enron.com          Milagros Velasquez    0.000576277848309
8       18400  bill.gulyassy@enron.com               -                     0.000569971766168
8       13404  deane.pierce@enron.com                Deane Pierce          0.000564355366972
8       20456  peter.ghavami@enron.com               -                     0.000556176469265
8       5688   ken.skilling@enron.com                -                     0.000552705327190


CATEGORY 9

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 3091                  COMPONENTS: 8
LARGEST COMPONENT SIZE: 3075    PERCENT OF TOTAL GRAPH: 99.48%
GROUP DEGREE: 0.08164           GRAPH DENSITY: 0.00097
GROUP CLOSENESS: 0.00465        GROUP BETWEENNESS: 0.23917
AVERAGE p(z|u): 0.02            STDEV p(z|u): 0.02

MOST PROBABLE USERS

Topic#  ID#    Email Address              Name              p(z|u)
9       8773   michelle.nelson@enron.com  Michelle Nelson   0.001507114155269
9       742    amanda.rybarski@enron.com  Amanda Rybarski   0.001394458259131
9       707    mike.maggi@enron.com       Mike Maggi        0.001278302835769
9       12142  adriana.wynn@enron.com     Adriana Wynn      0.000739838987366
9       14652  amanda.huble@enron.com     Amanda Huble      0.000642563356775
9       532    reginald.hart@enron.com    Reginald Hart     0.000636878764667
9       20547  lance.jameson@enron.com    -                 0.000555991530417
9       2185   michelle.hicks@enron.com   Michelle Hicks    0.000426219171297
9       14874  reginald.smith@enron.com   -                 0.000419969124568
9       8988   s..presas@enron.com        Gracie S. Presas  0.000406706548263
9       83584  adam.giannone@enron.com    Adam Giannone     0.000329105902344
9       7619   vikram.singh@enron.com     Vikram Singh      0.000312798538870
9       2603   julia.sudduth@enron.com    Julia Sudduth     0.000302450459812
9       243    jim.cashion@enron.com      Jim Cashion       0.000298734579388
9       2609   alex.villarreal@enron.com  Alex Villarreal   0.000298599782582
9       1810   wendy.fincher@enron.com    Wendy Fincher     0.000297432909315
9       663    julie.clyatt@enron.com     Julie Clyatt      0.000294151169298
9       7661   marc.graubart@enron.com    Marc Graubart     0.000287774474574
9       21096  rita.houston@enron.com     Rita Houston      0.000284827677990
9       8400   robin.veariel@enron.com    Robin Veariel     0.000274705443903


CATEGORY 10

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 9232                  COMPONENTS: 16
LARGEST COMPONENT SIZE: 9194    PERCENT OF TOTAL GRAPH: 99.59%
GROUP DEGREE: 0.16899           GRAPH DENSITY: 0.00065
GROUP CLOSENESS: 0.00087        GROUP BETWEENNESS: 0.25984
AVERAGE p(z|u): 0.02            STDEV p(z|u): 0.03


MOST PROBABLE USERS

Topic#  ID#    Email Address                   Name                  p(z|u)
10      7667   ipayit@enron.com                iPayit@Enron.com>@EN  0.001408019131315
10      797    sap_security@enron.com          -                     0.001376348794641
10      2985   enron.payroll@enron.com         "Enron.Payroll@enron  0.001370844057360
10      3652   payroll.enron@enron.com         Enron Payroll         0.001339999077432
10      11479  ibuyit.payables@enron.com       iBuyit.Payables@Enro  0.001322737866577
10      6180   enron.expertfinder@enron.com    -                     0.001277069974023
10      6520   payables.ibuyit@enron.com       iBuyit Payables       0.001224519134846
10      3602   robert.jones@mailman.enron.com  -                     0.001206765097339
10      18326  mbx_iscinfra@enron.com          -                     0.001114268782204
10      18319  tahnee.stall@enron.com          -                     0.001110400348778
10      1502   ic@enron.com                    "ic@enron.com"        0.001050188442557
10      28467  resources@enron.com             "human resources@enr  0.001014836571422
10      1187   confirmit@enron.com             Confirmit             0.000989726115596
10      18318  tammy.marcontell@enron.com      -                     0.000952825592129
10      21165  carolyn.graham@enron.com        Carolyn Graham        0.000823777057485
10      464    sunil.abraham@enron.com         Sunil Abraham         0.000815470882058
10      28781  enronanywhere@enron.com         "enronanywhere@enron  0.000769680891276
10      17156  clickathomepilot3@enron.com     "ClickAtHomePilot3@e  0.000756431101787
10      8342   ibuyit@enron.com                -                     0.000738355106937
10      34952  isc.groups@enron.com            -                     0.000650148600542

*****************************************************************************************************************
CATEGORY 11

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 1966                  COMPONENTS: 14
LARGEST COMPONENT SIZE: 1934    PERCENT OF TOTAL GRAPH: 98.37%
GROUP DEGREE: 0.10475           GRAPH DENSITY: 0.00305
GROUP CLOSENESS: 0.00176        GROUP BETWEENNESS: 0.18861
AVERAGE p(z|u): 0.02            STDEV p(z|u): 0.03

MOST PROBABLE USERS

Topic#  ID#    Email Address                   Name                  p(z|u)
11      566    evelyn.metoyer@enron.com        Evelyn Metoyer        0.001526876706561
11      1056   kerri.thompson@enron.com        Kerri Thompson        0.001495630223664
11      582    stephanie.piwetz@enron.com      Stephanie Piwetz      0.001246524881606
11      21     kate.symes@enron.com            Kate Symes            0.001071523355340
11      138    kysa.alport@enron.com           -                     0.000970956469983
11      796    shift.dl-portland@enron.com     DL-Portland Real Tim  0.000959432896811
11      20     geir.solberg@enron.com          Geir Solberg          0.000837022274781
11      126    theresa.villeggiante@enron.com  -                     0.000823167032216
11      127    josie.jarnagin@enron.com        -                     0.000808053992738
11      40     kara.ausenhus@enron.com         Kara Ausenhus         0.000804592961174
11      490    sharen.cason@enron.com          Sharen Cason          0.000729561254883
11      2009   alexander.mcelreath@enron.com   Alexander McElreath   0.000718309066716
11      1051   judy.dyer@enron.com             Judy Dyer             0.000717313008419
11      1953   kayla.harmon@enron.com          Kayla Harmon          0.000692622393449
11      219    michael.mier@enron.com          Michael Mier          0.000658581398353
11      981    Portland.shift@enron.com        Portland Shift        0.000656633930748
11      17     bert.meyers@enron.com           Bert Meyers           0.000647197008757
11      620    ryan.williams@enron.com         Ryan Williams         0.000638369003422
11      972    shift.portland@enron.com        Portland Shift        0.000621784981891
11      123    chris.mumm@enron.com            Chris Mumm            0.000620099069163


CATEGORY 12

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 4630                  COMPONENTS: 13
LARGEST COMPONENT SIZE: 4578    PERCENT OF TOTAL GRAPH: 98.88%
GROUP DEGREE: 0.09902           GRAPH DENSITY: 0.00108
GROUP CLOSENESS: 0.0068         GROUP BETWEENNESS: 0.15957
AVERAGE p(z|u): 0.02            STDEV p(z|u): 0.02


MOST PROBABLE USERS

Topic#  ID#    Email Address             Name            p(z|u)
12      25328  tim.mckone@enron.com      -               0.001112582690605
12      51741  ramon.alvarez@enron.com   -               0.000875863487561
12      17749  martin.smith@enron.com    Martin Smith    0.000661615718698
12      80962  mlawles@enron.com         -               0.000552769019784
12      41651  tlehan@enron.com          -               0.000543875920904
12      41578  ccheek@enron.com          -               0.000540950394832
12      80963  staci_holtzman@enron.com  -               0.000537241130249
12      83451  kevin.d.jordan@enron.com  -               0.000525195133049
12      81353  jazayeri.peter@enron.com  -               0.000486844092153
12      82077  'bump@enron.com           -               0.000474503349677
12      17623  duncan.dave@enron.com     Dave Duncan     0.000458159956047
12      82078  dbump@ect.enron.com       -               0.000458070562629
12      30929  john.hopley@enron.com     -               0.000441690483965
12      2232   david.reinfeld@enron.com  David Reinfeld  0.000439909549706
12      35852  'rogers@enron.com         -               0.000439313499888
12      43924  filterpst@enron.com       FILTERPST       0.000438781215502
12      4370   hcubill@enron.com         -               0.000417858611289
12      6344   erica.harris@enron.com    -               0.000414267975706
12      6351   fernando.parra@enron.com  -               0.000411151153344
12      58471  nicola.sanders@enron.com  -               0.000397580267494


CATEGORY 13

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 6933                  COMPONENTS: 18
LARGEST COMPONENT SIZE: 6873    PERCENT OF TOTAL GRAPH: 99.13%
GROUP DEGREE: 0.08059           GRAPH DENSITY: 0.00087
GROUP CLOSENESS: 0.00050        GROUP BETWEENNESS: 0.14974
AVERAGE p(z|u): 0.02            STDEV p(z|u): 0.01

MOST PROBABLE USERS

Topic#  ID#    Email Address                         Name                  p(z|u)
13      4110   klay@enron.com                        -                     0.001601547055236
13      14112  lfan@mailman.enron.com                "lfan"                0.000275370341249
13      3156   steve.walker@enron.com                -                     0.000237467793189
13      10603  colin.skellett@enron.com              -                     0.000228362440052
13      21872  dave.ellis@enron.com                  Dave Ellis            0.000183015269560
13      14521  qwest.net@mailman.enron.com           "smoray!qwest.net"    0.000181180119423
13      1911   resources.human@enron.com             Human Resources       0.000175591724167
13      12993  cecil.stinemetz@enron.com             Cecil Stinemetz       0.000174198878161
13      12940  mike.underwood@enron.com              Mike Underwood        0.000166045381950
13      13411  rebecca.longoria@enron.com            Rebecca Longoria      0.000158315996298
13      1816   josh.duncan@enron.com                 Josh Duncan           0.000146302011811
13      21873  jeff.borg@enron.com                   Jeff Borg             0.000145705460432
13      13343  delia.walters@enron.com               Delia Walters         0.000144502959270
13      41375  steve.dahnke@enron.com                -                     0.000140111382700
13      18017  ruth.mann@enron.com                   -                     0.000134505674594
13      1891   dl-ga-all_enron_worldwide3@enron.com  DL-GA-all_enron_worl  0.000124086451007
13      7942   timothy.hubbard@enron.com             -                     0.000123929995919
13      50042  psmith3@enron.com                     -                     0.000121327168634
13      56739  jim.barnes@enron.com                  -                     0.000120099584951
13      62265  corn!983@enron.com                    -                     0.000120099584951


CATEGORY 14

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 4888                  COMPONENTS: 11
LARGEST COMPONENT SIZE: 4856    PERCENT OF TOTAL GRAPH: 99.35%
GROUP DEGREE: 0.06139           GRAPH DENSITY: 0.00143
GROUP CLOSENESS: 0.00127        GROUP BETWEENNESS: 0.08957
AVERAGE p(z|u): 0.02            STDEV p(z|u): 0.02

MOST PROBABLE USERS

Topic#  ID#    Email Address                  Name               p(z|u)
14      83555  weatherwarn@mailman.enron.com  -                  0.001378078541067
14      83556  subscribers@mailman.enron.com  -                  0.001347729515068
14      8676   troy'.'brothers@enron.com      -                  0.001120182691548
14      3000   tammy.gilmore@enron.com        Gilmore            0.001040966817847
14      1364   laura.lantefield@enron.com     Laura Lantefield   0.001019182917989
14      14572  jeff.nielsen@enron.com         Jeff Nielsen       0.001013928475712
14      22443  benjamin.schoene@enron.com     Benjamin Schoene   0.000996013660566
14      14742  paul.tate@enron.com            Paul Tate          0.000965088538894
14      11187  elberg.gelin@enron.com         -                  0.000923310865935
14      21134  corey.wilkes@enron.com         Corey Wilkes       0.000916634570509
14      24546  dscott4@enron.com              -                  0.000825500585103
14      15268  eileen.peebles@enron.com       -                  0.000792167307948
14      54135  bnllets@enron.com              -                  0.000756070089463
14      549    lisa.kinsey@enron.com          Lisa Kinsey        0.000748912162941
14      53749  v.dickerson@enron.com          Steve V Dickerson  0.000724070652120
14      14816  Sebastian.corbacho@enron.com   -                  0.000656492998498
14      44117  carolyn.descoteaux@enron.com   -                  0.000627984004197
14      36476  paul.miller@enron.com          -                  0.000620904414153
14      53750  michael.loeffler@enron.com     Michael Loeffler   0.000619806090657
14      761    m..tholt@enron.com             Jane M. Tholt      0.000607471228890


******************************************************************************************************************* 
CATEGORY 15

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 8081                  COMPONENTS: 17
LARGEST COMPONENT SIZE: 8034    PERCENT OF TOTAL GRAPH: 99.42%
GROUP DEGREE: 0.06638           GRAPH DENSITY: 0.00087
GROUP CLOSENESS: 0.00060        GROUP BETWEENNESS: 0.08977
AVERAGE p(z|u): 0.02            STDEV p(z|u): 0.01


MOST PROBABLE USERS

Topic#  ID#    Email Address                      Name                  p(z|u)
15      11411  phyllis.anzalone@enron.com         -                     0.001253298613513
15      22613  tdonoho@enron.com                  -                     0.000914945170041
15      30920  kwatson@enron.com                  -                     0.000774913707859
15      17265  fsturm@enron.com                   fsturm                0.000749642270803
15      58588  "undisclosed-recipient"@enron.com  "Undisclosed-Recipie  0.000672556823129
15      39519  jquenet@enron.com                  Quenet                0.000654113616571
15      77620  fermis@ect.enron.com               -                     0.000639238376987
15      59958  tdonoho@ect.enron.com              -                     0.000558169368411
15      35461  tmartin@ect.enron.com              -                     0.000487853254502
15      14579  michael.schilmoeller@enron.com     Michael SCHILMOELLER  0.000442609712329
15      41633  fking@enron.com                    -                     0.000409298705275
15      29712  vkamins@ect.enron.com              -                     0.000406623618470
15      4556   mlenhart@enron.com                 -                     0.000375768458329
15      85723  blichte@enron.com                  -                     0.000307549527753
15      2618   rick.wurlitzer@enron.com           Rick Wurlitzer        0.000267904255165
15      8792   sitara@enron.com                   Sitara                0.000259920051887
15      18943  james.monroe@enron.com             James Monroe          0.000246386620151
15      32484  "the.desk"@enron.com               -                     0.000237989286848
15      6322   rick.guttroff@enron.com            -                     0.000232005155999
15      24011  announcement.ees@enron.com         EES Product Announce  0.000225283228657


CATEGORY 16

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 10117                 COMPONENTS: 13
LARGEST COMPONENT SIZE: 10085   PERCENT OF TOTAL GRAPH: 99.68%
GROUP DEGREE: 0.09216           GRAPH DENSITY: 0.00069
GROUP CLOSENESS: 0.00117        GROUP BETWEENNESS: 0.10983
AVERAGE p(z|u): 0.02            STDEV p(z|u): 0.02


MOST PROBABLE USERS

Topic#  ID#    Email Address                  Name               p(z|u)
16      6169   dottie.kerr@enron.com          -                  0.000580233467676
16      15138  kirk.neuner@enron.com          Kirk Neuner        0.000558905753396
16      83222  pickel.robert@enron.com        -                  0.000508826209101
16      30930  david.martin@enron.com         -                  0.000473606941108
16      1838   paul.lebeau@enron.com          Paul Lebeau        0.000467157456745
16      17950  tracey.kozadinos@enron.com     Tracey Kozadinos   0.000459044558479
16      32307  andrew.s.fastow@enron.com      Andrew.S.Fastow    0.000453735287827
16      24080  registrar.isc@enron.com        ISC Registrar      0.000451068744026
16      19236  lisa.polk@enron.com            -                  0.000401456948249
16      29742  mcarson@enron.com              -                  0.000360312055141
16      10959  althea.gordon@enron.com        -                  0.000346135857069
16      9372   jeffrey.mcclellan@enron.com    -                  0.000344556711531
16      10779  michael.rosen@enron.com        -                  0.000340814768847
16      1827   constance.charles@enron.com    Constance Charles  0.000340010446842
16      29741  jballen@enron.com              -                  0.000334910619895
16      20337  ginger.gamble@enron.com        -                  0.000317164129345
16      19662  riccardo.bortolotti@enron.com  -                  0.000311737273871
16      15911  heather.johnson@enron.com      -                  0.000308307655930
16      6165   ypo_international@enron.com    -                  0.000308089644486
16      23987  sammi@enron.com                Sammi              0.000301927804327


CATEGORY 17

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 4851                  COMPONENTS: 11
LARGEST COMPONENT SIZE: 4817    PERCENT OF TOTAL GRAPH: 99.30%
GROUP DEGREE: 0.09475           GRAPH DENSITY: 0.00103
GROUP CLOSENESS: 0.00109        GROUP BETWEENNESS: 0.14956
AVERAGE p(z|u): 0.02            STDEV p(z|u): 0.02

MOST PROBABLE USERS

Topic#  ID#    Email Address                  Name                  p(z|u)
17      37111  .stephens@enron.com            bridgeline            0.001323511196538
17      3354   tom.ward@enron.com             TOM WARD              0.000966067937626
17      37443  'fenner@enron.com              -                     0.000937569931055
17      81781  'jernigan@enron.com            -                     0.000928733539493
17      52959  'thompson@enron.com            -                     0.000920837164148
17      4581   mday@enron.com                 -                     0.000882216853996
17      31826  '"sbigalow"@enron.com          "sbigalow"            0.000852736525420
17      24509  'proctor@enron.com             -                     0.000850029138921
17      87152  'rector@enron.com              -                     0.000740618599562
17      62800  beth.cherry@enron.com          Beth Cherry           0.000710503914088
17      20246  valerie.curtis@enron.com       -                     0.000700356319002
17      85652  vo.hoang@enron.com             Hoang Vo              0.000694711807182
17      38921  jennifer.ballas@enron.com      Jennifer Ballas       0.000671776000082
17      82927  'peters@enron.com              -                     0.000638237810676
17      52960  'lipper@enron.com              -                     0.000613008799940
17      64777  'keffer@enron.com              -                     0.000581600982873
17      87149  'gapinski@enron.com            -                     0.000573763927601
17      81538  kalembka".'"lech@enron.com     -                     0.000567123206439
17      52997  assoc.'.'california@enron.com  California Cast Meta  0.000563394640988
17      24662  .germany@enron.com             wd                    0.000560715397035


CATEGORY 18

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 2653                  COMPONENTS: 17
LARGEST COMPONENT SIZE: 2600    PERCENT OF TOTAL GRAPH: 98.00%
GROUP DEGREE: 0.25123           GRAPH DENSITY: 0.00339
GROUP CLOSENESS: 0.00077        GROUP BETWEENNESS: 0.23940
AVERAGE p(z|u): 0.02            STDEV p(z|u): 0.04


MOST PROBABLE USERS

Topic#  ID#    Email Address               Name                  p(z|u)
18      3126   debora.whitehead@enron.com  -                     0.001354693023873
18      49935  sstoness@enron.com          -                     0.001343278555921
18      3136   leasa.lopez@enron.com       -                     0.001339989957675
18      37928  dblack@enron.com            -                     0.001334889634941
18      24207  .sue@enron.com              e-mail                0.001329385040080
18      1629   ken.gustafson@enron.com     -                     0.001307291111144
18      51122  jlewis@enron.com            "                     0.001284769312297
18      49932  tjohnso8@enron.com          -                     0.001275922795514
18      48832  james_d_steffes@enron.com   James D Steffes       0.001245590861660
18      28468  terry.donovan@enron.com     -                     0.001190484546632
18      52415  johnson.tamara@enron.com    -                     0.001164952932672
18      49988  clarian.vondrak@enron.com   -                     0.001072665655453
18      50118  fvickers@enron.com          -                     0.001048300576940
18      34229  mpalmer@enron.com           mpalmer@enron.com     0.001047373129123
18      1228   mark.fillinger@enron.com    Mark Fillinger        0.001027154035031
18      52531  savage.gordon@enron.com     Gordon Savage         0.001002542002956
18      52809  sf'.'sue@enron.com          Sue Mara at Enron SF  0.001002248494077
18      52530  coffing.timothy@enron.com   Timothy Coffing       0.000969899406956
18      46859  smara@enron.com             "                     0.000945990094958
18      3228   .jim@enron.com              e-mail                0.000936966686254


***************************************************************************************************************** 
CATEGORY 19

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 7223                  COMPONENTS: 13
LARGEST COMPONENT SIZE: 7164    PERCENT OF TOTAL GRAPH: 99.18%
GROUP DEGREE: 0.09008           GRAPH DENSITY: 0.00083
GROUP CLOSENESS: 0.00047        GROUP BETWEENNESS: 0.17975
AVERAGE p(z|u): 0.02            STDEV p(z|u): 0.02


MOST PROBABLE USERS

Topic#  ID#    Email Address                 Name                  p(z|u)
19      1488   perfmgmt@enron.com            "Performance Evaluat  0.001525791142995
19      22319  perfmgmt@ect.enron.com        -                     0.001371552789879
19      3580   tiffany.smith@enron.com       Tiffany Smith         0.001153296600028
19      1599   john.disturnal@enron.com      -                     0.001062847787452
19      7009   jay.knoblauh@enron.com        Jay Knoblauh          0.001029857296022
19      6943   gregory.schockling@enron.com  Gregory Schockling    0.000984788500517
19      15206  jeff.stephens@enron.com       Jeff Stephens         0.000978016231526
19      19486  steve.beck@enron.com          -                     0.000842856199101
19      8829   jeffery.stephens@enron.com    Jeffery Stephens      0.000831147356346
19      3520   ragan.bond@enron.com          Ragan Bond            0.000820963992964
19      46295  derrick.jr.@enron.com         -                     0.000560675749269
19      18047  project.team@enron.com        -                     0.000526954152307
19      6170   henry.emery@enron.com         -                     0.000507744129306
19      2625   jose.favela@enron.com         -                     0.000477683957123
19      60151  hendrickson@enron.com         -                     0.000454311071428
19      11553  hodge@enron.com               -                     0.000411327981537
19      27157  griffith@enron.com            -                     0.000397450507797
19      7147   cullen.duke@enron.com         Cullen Duke           0.000384117123224
19      15291  axisteam@enron.com            The Associate and A   0.000373160231628
19      37128  dl-ga-pas@enron.com           DL-GA-PAS             0.000358662874751


CATEGORY 20

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 7203                  COMPONENTS: 19
LARGEST COMPONENT SIZE: 7140    PERCENT OF TOTAL GRAPH: 99.13%
GROUP DEGREE: 0.08949           GRAPH DENSITY: 0.00069
GROUP CLOSENESS: 0.00047        GROUP BETWEENNESS: 0.11971
AVERAGE p(z|u): 0.02            STDEV p(z|u): 0.02


MOST PROBABLE USERS

Topic#  ID#    Email Address                 Name                p(z|u)
20      1530   sandy.morris@enron.com        Sandy Morris        0.000874483033774
20      22317  heidi.dubose@enron.com        -                   0.000830742698895
20      1213   backbone.ens@enron.com        -                   0.000729138965996
20      228    michael.belmont@enron.com     Michael Belmont     0.000687949885586
20      19381  jim.fussell@enron.com         -                   0.000663413538894
20      40864  murray.bridgmount@enron.com   Murray Bridgmount   0.000633919853904
20      15669  sscott5@enron.com             -                   0.000611384940309
20      682    claudia.guerra@enron.com      Claudia Guerra      0.000570851660099
20      65     anna.mehrer@enron.com         -                   0.000554254779448
20      60202  peter.f.keavey@enron.com      peter.f.keavey      0.000538657148020
20      40865  datacomms.european@enron.com  European DataComms  0.000497040933653
20      24256  cgerman@ect.enron.com         -                   0.000492514042629
20      6861   david.wile@enron.com          David Wile          0.000491017748451
20      6891   gail.kettenbrink@enron.com    Gail Kettenbrink    0.000470825018992
20      686    d..hogan@enron.com            Irena D. Hogan      0.000470467464701
20      83277  mtaylo1@ect.enron.com         -                   0.000439404166548
20      25054  enron.gss@enron.com           Enron GSS           0.000420930480870
20      83912  mark.e.taylor@enron.com       -                   0.000417120023755
20      37242  bryan.vaclavik@enron.com      Bryan Vaclavik      0.000412479769939
20      52919  5pp1@enron.com                PP                  0.000409820643319


CATEGORY 21

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 5312  COMPONENTS: 15
LARGEST COMPONENT SIZE: 5262  PERCENT OF TOTAL GRAPH: 99.06%
GROUP DEGREE: 0.16252  GRAPH DENSITY: 0.00075
GROUP CLOSENESS: 0.00077  GROUP BETWEENNESS: 0.36966
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.02

MOST PROBABLE USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
21      3780   enron.mailsweeper.admin@enron.com                               0.001361717288649
21      3973   admin.enron@enron.com                     Enron MailSweeper Ad  0.001355363291674
21      57681  munder.mail.list@mailman.enron.com                              0.001339082411155
21      6580   angela.barnett@enron.com                  Angela Barnett        0.001095645310547
21      16217  rose.botello@enron.com                                          0.001085409345096
21      18727  regina.blackshear@enron.com               Regina Blackshear     0.001036375541931
21      6837   diane.salcido@enron.com                   Diane Salcido         0.001014008469762
21      20483  amber.limas@enron.com                                           0.001006886782093
21      7571   maria.sandoval@enron.com                  Maria Sandoval        0.000940728480294
21      6992   judy.hernandez@enron.com                  Judy Hernandez        0.000914536743364
21      16110  dbaughm@notes.enron.com                                         0.000868739973280
21      74130  shauncy.mathews@enron.com                                       0.000848616431053
21      19886  leslie.smith@enron.com                                          0.000816267499590
21      6977   jennifer.cutaia@enron.com                 Jennifer Cutaia       0.000799789279406
21      31730  enron.messaging.administration@enron.com                        0.000768852347271
21      74198  yvonne.acosta@enron.com                                         0.000729258614691
21      20467  shirlet.williams@enron.com                                      0.000680051378254
21      18364  nicole.mendez@enron.com                                         0.000660028430904
21      3357   adam.senn@enron.com                       Adam Senn             0.000654768934867
21      53196  davette.warren@enron.com                                        0.000646320487144


CATEGORY 22

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 4561  COMPONENTS: 11
LARGEST COMPONENT SIZE: 4515  PERCENT OF TOTAL GRAPH: 98.99%
GROUP DEGREE: 0.07743  GRAPH DENSITY: 0.00088
GROUP CLOSENESS: 0.00077  GROUP BETWEENNESS: 0.14952
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.01

MOST PROBABLE USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
22      15169  jeff.pearson@enron.com                    Jeff Pearson          0.001253668216739
22      69639  spendegr@enron.com                                              0.000989841761548
22      4989   debbie.doyle@enron.com                                          0.000869871909463
22      18735  lois.ford@enron.com                       Lois Ford             0.000836060891752
22      28817  postmaster@corp.enron.com                                       0.000832470263197
22      24009  landry.pamela@enron.com                   pamela landry         0.000808897513137
22      8399   jimmy.simien@enron.com                    Jimmy Simien          0.000797366149122
22      39389  cyntia.distefano@enron.com                Cyntia DiStefano      0.000792161720882
22      1365   annette.glod@enron.com                    Annette Glod          0.000781796475648
22      15192  mike@enron.com                            Mike                  0.000766113550658
22      69562  larry.f.campbell@enron.com                                      0.000751543355103
22      2188   s..gartner@enron.com                      Julie S. Gartner      0.000738060823994
22      69632  j.kinser@enron.com                                              0.000737706838406
22      82885  theresa_staab@enron.com                                         0.000704316279000
22      49937  rsanders@enron.com                                              0.000660653664704
22      44165  robin.border@enron.com                                          0.000637761479596
22      79464  quickplace@nahou-lnww01.ots.enron.com     "customerservice"     0.000621914392699
22      26498  scott.crowell@enron.com                                         0.000607660447529
22      1801   leesa.white@enron.com                     Leesa White           0.000599852978852
22      8931   kevin.heal@enron.com                      Kevin Heal            0.000596159519573


****************************************************************************************************************** 
CATEGORY 23

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 6040  COMPONENTS: 18
LARGEST COMPONENT SIZE: 5990  PERCENT OF TOTAL GRAPH: 99.17%
GROUP DEGREE: 0.10322  GRAPH DENSITY: 0.00066
GROUP CLOSENESS: 0.00064  GROUP BETWEENNESS: 0.14962
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.01

MOST PROBABLE USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
23      79501  gblair@ei.enron.com                                             0.000462412953153
23      21413  ken@enron.com                             Ken                   0.000323841152123
23      21422  hill.heather@enron.com                                          0.000286077056466
23      14561  hshivel@enron.com                                               0.000269967911928
23      59931  mogel@enron.com                                                 0.000243749594960
23      52372  nahigian@enron.com                                              0.000192334598779
23      62694  jason.r.williams@enron.com                                      0.000187377638054
23      52357  counihan@enron.com                                              0.000171950572198
23      52378  violette@enron.com                                              0.000167424947171
23      11194  omar.hasan@enron.com                                            0.000160489024376
23      26485  dgiron@ect.enron.com                                            0.000155796766887
23      71916  fatimah.ducros@enron.com                  Fatimah Ducros        0.000153229223724
23      49830  theizenrader@enron.com                                          0.000146050067599
23      22671  dsmith3@enron.com                                               0.000145812358196
23      25760  .pam@enron.com                                                  0.000142357027217
23      59955  america.dl-outlook@enron.com              DL-Outlook Users Nor  0.000129385918782
23      39271  gstorey@enron.com                                               0.000128688760621
23      32865  network.security@enron.com                                      0.000127291074582
23      13429  cynthia.boseman-harris@enron.com          Cynthia Boseman-Harr  0.000123471212527
23      58138  julie.a.gomez@enron.com                                         0.000120555589723


CATEGORY 24

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 6534  COMPONENTS: 16
LARGEST COMPONENT SIZE: 6492  PERCENT OF TOTAL GRAPH: 99.36%
GROUP DEGREE: 0.06928  GRAPH DENSITY: 0.00092
GROUP CLOSENESS: 0.00076  GROUP BETWEENNESS: 0.09970
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.01

MOST PROBABLE USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
24      3047   ted.robinson@enron.com                    Ted Robinson          0.000716540779482
24      56955  alexandre.bueno@enron.com                                       0.000625717174026
24      26651  joe.kolb@enron.com                        Joe Kolb              0.000537573977194
24      15648  farzad.farhangnia@enron.com                                     0.000507815820363
24      71298  rgay@enron.com                                                  0.000487205057792
24      11170  gabriel.chavez@enron.com                                        0.000464134121822
24      15399  tmartin@enron.com                                               0.000459964533289
24      20185  micha.makowsky@enron.com                                        0.000455699179876
24      542    adam.johnson@enron.com                    Adam Johnson          0.000423893286668
24      14872  adam.siegel@enron.com                                           0.000399092942546
24      5735   john.wodraska@enron.com                                         0.000336312342496
24      40227  maurice.gilbert@enron.com                 Maurice Gilbert       0.000332941298668
24      11391  james.bryja@enron.com                                           0.000326345786482
24      41826  bcc@enron.com                                                   0.000307111662081
24      47371  enron.environmental@enron.com                                   0.000299723845509
24      68584  calgary.reception@enron.com                                     0.000298517616026
24      20721  rajneesh.salhotra@enron.com                                     0.000291543892562
24      2299   trevor.woods@enron.com                    Trevor Woods          0.000281973476738
24      37345  grona.suzanne@enron.com                                         0.000258010319290
24      12687  campbell.catherine@enron.com              Catherine Campbell    0.000254320251200


CATEGORY 25

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 2949  COMPONENTS: 11
LARGEST COMPONENT SIZE: 2890  PERCENT OF TOTAL GRAPH: 98.00%
GROUP DEGREE: 0.16070  GRAPH DENSITY: 0.00136
GROUP CLOSENESS: 0.00070  GROUP BETWEENNESS: 0.26934
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.03

MOST PROBABLE USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
25      983    westdesksupport@enron.com                                       0.001101294436163
25      31991  julius.zajda@enron.com                                          0.001085519268517
25      6707   brad.romine@enron.com                     Brad Romine           0.000907610059737
25      2104   tom.barkley@enron.com                     Tom Barkley           0.000896814524694
25      33702  adam.brulinski@enron.com                                        0.000861365322836
25      21046  jarek.dybowski@enron.com                  Jarek Dybowski        0.000819074111278
25      30684  eloise.meza@enron.com                                           0.000789624728705
25      2078   jason.sokolov@enron.com                   Jason Sokolov         0.000769174246179
25      8822   kenneth.parkhill@enron.com                Kenneth Parkhill      0.000722874376853
25      21151  steve.bigalow@enron.com                   Steve Bigalow         0.000700520452098
25      15129  bessik.matchavariani@enron.com            Bessik Matchavariani  0.000676921169226
25      6991   john.henderson@enron.com                  John Henderson        0.000577687162124
25      32739  wsmith2@enron.com                                               0.000571572152068
25      32865  network.security@enron.com                                      0.000541798179713
25      6032   kevin.moore@enron.com                                           0.000535716372260
25      19763  mary.bailey@enron.com                                           0.000519439944775
25      29987  youyi.feng@enron.com                                            0.000516879223629
25      2092   rakesh.bharati@enron.com                  Rakesh Bharati        0.000497251843575
25      1728   mark.ruane@enron.com                                            0.000480913131608
25      31231  katja.schilling@enron.com                                       0.000476298585543


CATEGORY 26

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 6559  COMPONENTS: 14
LARGEST COMPONENT SIZE: 6527  PERCENT OF TOTAL GRAPH: 99.51%
GROUP DEGREE: 0.07382  GRAPH DENSITY: 0.00091
GROUP CLOSENESS: 0.00124  GROUP BETWEENNESS: 0.09968
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.01

MOST PROBABLE USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
26      3783   douglas.smith@enron.com                   Douglas SMITH         0.000588198618238
26      12183  boardroom@enron.com                       Boardroom             0.000400120120558
26      39745  joao.albuquerque@enron.com                                      0.000396507373111
26      53169  .arm@enron.com                            e-mail                0.000362655683908
26      21296  money.in.motion@mailman.enron.com                               0.000337474714293
26      587    lindsay.renaud@enron.com                  Lindsay Renaud        0.000335546829323
26      18361  s..yao@enron.com                          Anne S. Yao           0.000334001683022
26      1706   gary.peng@enron.com                                             0.000314439436021
26      18460  jens.gobel@enron.com                      Jens Gobel            0.000305669278358
26      17645  cassandra.chinkin@enron.com               Cassandra Chinkin     0.000297263581907
26      3502   e..jones@enron.com                        Karen E. Jones        0.000292377166431
26      17733  padmesh.thuraisingham@enron.com           Padmesh Thuraisingha  0.000290774455809
26      48719  the.distribution@enron.com                                      0.000284647663389
26      9525   lkitchen@enron.com                                              0.000272276242132
26      2251   kikumi.kishigami@enron.com                Kikumi Kishigami      0.000269348319116
26      80335  leonardo.cardoso@enron.com                Leonardo Cardoso      0.000254792475714
26      82299  open2win.0111.net@mailman.enron.com                             0.000253328876855
26      416    jason.biever@enron.com                    Jason Biever          0.000240887029362
26      12155  david.tonsall@enron.com                   David Tonsall         0.000235305103431
26      56715  .boe@enron.com                            lawrence              0.000230033120632

************************************************************************
CATEGORY 27

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 7261  COMPONENTS: 15
LARGEST COMPONENT SIZE: 7220  PERCENT OF TOTAL GRAPH: 99.44%
GROUP DEGREE: 0.29692  GRAPH DENSITY: 0.02119
GROUP CLOSENESS: 0.16754  GROUP BETWEENNESS: 0.29250
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.02

MOST PROBABLE USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
27      17493  .linda@enron.com                          e-mail                0.000849482133330
27      44102  christi.culwell@enron.com                                       0.000831723790035
27      7507   crystal.reyna@enron.com                   Crystal Reyna         0.000827492263327
27      24848  maria.cisneros@enron.com                                        0.000715916931542
27      77911  walt.serrano@enron.com                    Walt Serrano          0.000696540149539
27      20010  cstclai@enron.com                                               0.000677666209391
27      16130  mills.bret@enron.com                      Bret Mills            0.000659217051214
27      2828   jay.wills@enron.com                                             0.000643678177557
27      4511   lgillet@enron.com                                               0.000637170888441
27      13179  .sally@enron.com                          e-mail                0.000627451721759
27      19631  sandra.mcnichols@enron.com                                      0.000613256029816
27      37179  fenner'.'molly@enron.com                  Molly Fenner          0.000613180814743
27      778    ashley.worthing@enron.com                 Ashley Worthing       0.000602175621182
27      29310  claudia.santos@enron.com                  Claudia Santos        0.000593610671816
27      44164  jon.trevelise@enron.com                                         0.000590106209773
27      1040   3kuehn@enron.com                                                0.000581110745711
27      1264   rose.rivera@enron.com                     Rose Rivera           0.000579042137249
27      969    .tim@enron.com                            e-mail                0.000554628316058
27      769    laura.vargas@enron.com                    Laura Vargas          0.000530206305260
27      44142  ron.beidelman@enron.com                                         0.000522829926453


CATEGORY 28

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 3175  COMPONENTS: 11
LARGEST COMPONENT SIZE: 3137  PERCENT OF TOTAL GRAPH: 98.80%
GROUP DEGREE: 0.06818  GRAPH DENSITY: 0.00069
GROUP CLOSENESS: 0.00081  GROUP BETWEENNESS: 0.13974
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.01

MOST PROBABLE USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
28      20045  restricted.list@enron.com                                       0.001385620879894
28      2598   mark.smith@enron.com                      Mark Smith            0.000788461751867
28      1145   benjamin.rogers@enron.com                 Benjamin Rogers       0.000770889327724
28      54044  'ball@enron.com                                                 0.000665168283992
28      6324   mauboussin@enron.com                                            0.000581147586028
28      705    kori.loibl@enron.com                      Kori Loibl            0.000525358167771
28      19973  greg.blair@enron.com                                            0.000462418825328
28      8474   david.nutt@enron.com                      David Nutt            0.000446409716541
28      9057   scot.chambers@enron.com                                         0.000325154756165
28      17809  h..chin@enron.com                         Julia H. Chin         0.000293378166162
28      6859   j..vitrella@enron.com                     David J. Vitrella     0.000257094216471
28      76603  benjamin.rogers@ect.enron.com                                   0.000249496461858
28      16386  charles.vetters@enron.com                 Charles Vetters       0.000231380477064
28      35531  renee.ratcliff@enron.com                  Renee Ratcliff        0.000215246773470
28      19397  patrick.grant@enron.com                                         0.000215129478301
28      81875  .felicia@enron.com                        e-mail                0.000181378371911
28      18021  kim.bolton@enron.com                      Kim Bolton            0.000175105072265
28      76432  chris.norris@enron.com                                          0.000159059632272
28      46020  k..bargainer@enron.com                                          0.000152013823351
28      21099  jeff.hoover@enron.com                     Jeff Hoover           0.000132715750393


CATEGORY 29

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 3242  COMPONENTS: 20
LARGEST COMPONENT SIZE: 3191  PERCENT OF TOTAL GRAPH: 98.43%
GROUP DEGREE: 0.29692  GRAPH DENSITY: 0.02119
GROUP CLOSENESS: 0.16754  GROUP BETWEENNESS: 0.29250
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.04

MOST PROBABLE USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
29      53773  controllers.dl-ets@enron.com              DL-ETS Gas Controlle  0.001424273287272
29      76971  dl-etsgascontrollers@enron.com            DL-ETS Gas Controlle  0.001257231555973
29      17619  beverly.miller@enron.com                  Beverly Miller        0.001211426693265
29      19807  alma.carrillo@enron.com                                         0.001136626725753
29      53722  angela.white@enron.com                    Angela White          0.001117578736012
29      7005   jane.joyce@enron.com                      Jane Joyce            0.000894385936708
29      24885  ava.garcia@enron.com                                            0.000878997392153
29      54567  renee.perry@enron.com                     Renee Perry           0.000839418723067
29      19826  kim.perez@enron.com                                             0.000815505379513
29      54138  jerry.wilkens@enron.com                                         0.000809453965912
29      18192  sandy.sheffield@enron.com                                       0.000807162895742
29      42059  alicia.lenderman@enron.com                                      0.000782401198273
29      41978  dan.bunch@enron.com                                             0.000753560473298
29      24804  kelly.allen@enron.com                                           0.000708985549170
29      35481  randy.bryan@enron.com                     Randy Bryan           0.000661569025376
29      18640  valerie.giles@enron.com                   Valerie Giles         0.000656297229909
29      23680  rosemary.gracey@enron.com                                       0.000634957388910
29      43983  joni.bollinger@enron.com                                        0.000626956792132
29      24829  sharon.brown@enron.com                                          0.000616777142454
29      43959  jenerso@ei.enron.com                                            0.000612136420482


CATEGORY 30

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 3984  COMPONENTS: 6
LARGEST COMPONENT SIZE: 3971  PERCENT OF TOTAL GRAPH: 99.67%
GROUP DEGREE: 0.05718  GRAPH DENSITY: 0.00100
GROUP CLOSENESS: 0.00554  GROUP BETWEENNESS: 0.10933
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.03

MOST PROBABLE USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
30      15949  alexandra.saler@enron.com                                       0.001095899895894
30      6933   gabriel.monroy@enron.com                  Gabriel Monroy        0.001034703115733
30      3015   ted.evans@enron.com                       Ted Evans             0.000948143362125
30      5645   wilson.kriegel@enron.com                                        0.000859855612774
30      29138  douglas.nichols@enron.com                                       0.000851231371145
30      61637  charlotte.kraham@enron.com                                      0.000838904049729
30      22344  becky.pitre@enron.com                                           0.000815326423115
30      18956  cecilia.rodriguez@enron.com               Cecilia Rodriguez     0.000796196467072
30      10T5T  alma.martinez@enron.com                                         0.000790564297686
30      2197   ted.noble@enron.com                       Ted Noble             0.000768644742282
30      30824  joanne.smith@enron.com                    JoAnne Smith          0.000745825496676
30      66817  thomas.'paul@enron.com                                          0.000713062366461
30      11322  elizabeth.peters@enron.com                                      0.000711554608652
30      22     crystal.hyde@enron.com                    Crystal Hyde          0.000686471982273
30      44299  lnemec@ect.enron.com                      "Lisa Nemec"          0.000681339667818
30      11168  tobin.carlson@enron.com                                         0.000671220769090
30      19602  dianne.swiber@enron.com                                         0.000667135943555
30      3643   alexandra.villarreal@enron.com            Alexandra Villarreal  0.000658175910057
30      104    laura.wente@enron.com                                           0.000647290680664
30      2060   pete.heintzelman@enron.com                Pete Heintzelman      0.000643179613152


***************************************************************************************************************** 
CATEGORY 31

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 4177  COMPONENTS: 20
LARGEST COMPONENT SIZE: 4113  PERCENT OF TOTAL GRAPH: 98.47%
GROUP DEGREE: 0.06371  GRAPH DENSITY: 0.00120
GROUP CLOSENESS: 0.00055  GROUP BETWEENNESS: 0.08940
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.02

MOST PROBABLE USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
31      14993  eserver@enron.com                         eserver@enron.com@EN  0.001535657667116
31      38660  esaibi@enron.com                                                0.001381493992136
31      9987   jderric@enron.com                                               0.001109771432530
31      46388  ed.cattigan@enron.com                     Ed Cattigan           0.000979791192791
31      6156   j.harris@enron.com                                              0.000614701006350
31      1396   vicsandra.trujillo@enron.com              Vicsandra Trujillo    0.000368189267180
31      58222  status_updates@enron.com                                        0.000354788031081
31      4638   james.derrick@enron.com                                         0.000304719850918
31      4636   expense.report@enron.com                                        0.000289673888247
31      41562  esager2@enron.com                                               0.000281996079695
31      76758  cms.router@enron.com                      CMS Router            0.000268302778046
31      35486  h.fields@enron.com                        Sharon H Fields       0.000263009472195
31      69556  enron.general.announcements.enronxgate@e                        0.000222602733185
31      41645  lena.kasbekar@enron.com                                         0.000218030097477
31      592    paul.schiavone@enron.com                  Paul Schiavone        0.000201423346559
31      64778  1deberry@enron.com                                              0.000190172446330
31      5671   kelly.johnson@enron.com                                         0.000175098701755
31      24913  gary.hugo@enron.com                                             0.000172751462835
31      17913  deborah.heath@enron.com                   Deborah Heath         0.000169617204329
31      29445  all.users@enron.com                                             0.000169571501677


CATEGORY 32

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 11931  COMPONENTS: 4
LARGEST COMPONENT SIZE: 11895  PERCENT OF TOTAL GRAPH: 99.70%
GROUP DEGREE: 0.07389  GRAPH DENSITY: 0.00059
GROUP CLOSENESS: 0.00079  GROUP BETWEENNESS: 0.08986
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.02

MOST PROBABLE USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
32      6007   houston.report@enron.com                                        0.001430642023732
32      6226   enron.announcement@enron.com                                    0.001228008721844
32      6227   enron.action@enron.com                                          0.001129377391420
32      83     ena.relations@enron.com                   ENA Public Relations  0.001027423116974
32      2781   enron.announcements@enron.com                                   0.000986024421941
32      1176   ethink@enron.com                          ethink                0.000955751506016
32      412    no.address@enron.com                                            0.000909868666438
32      1078   40enron@enron.com                         Tracey Ramsey - Glob  0.000847021080647
32      18346  litebytz@enron.com                                              0.000788661449814
32      2883   all.houston@enron.com                                           0.000743423898866
32      2590   lauren.schlesinger@enron.com              Lauren Schlesinger    0.000703635634063
32      2023   officeofthechairman2@enron.com            Office of the Chairm  0.000702511607197
32      6242   all.downtown@enron.com                    All Enron Downtown    0.000701062144669
32      798    announcements.enron@enron.com             Enron General Announ  0.000684712203801
32      8645   dl-ga-all_enron_houston@enron.com         DL-GA-all_enron_hous  0.000676478304923
32      54150  gpg.announcement@enron.com                                      0.000675235845174
32      5582   body.shop@enron.com                                             0.000657937998171
32      811    administration.enron@enron.com            Enron Messaging Admi  0.000619679790514
32      61532  runners@enron.com                                               0.000601914016209
32      95     center.dl-portland@enron.com              DL-Portland World Tr  0.000593832578067


CATEGORY 33

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 6300  COMPONENTS: 8
LARGEST COMPONENT SIZE: 6285  PERCENT OF TOTAL GRAPH: 99.76%
GROUP DEGREE: 0.07417  GRAPH DENSITY: 0.00095
GROUP CLOSENESS: 0.00432  GROUP BETWEENNESS: 0.13968
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.02

MOST PROBABLE USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
33      7514   craig.rickard@enron.com                   Craig Rickard         0.001222342981691
33      18854  teresa.aguilera-peon@enron.com            Maria Teresa Aguiler  0.001161656534928
33      14946  cecil.stapley@enron.com                                         0.001158314626099
33      7528   t..robinson@enron.com                     Richard T. Robinson   0.001055521301486
33      3079   r..conner@enron.com                       Andrew R. Conner      0.001003383371760
33      18877  greg.bruch@enron.com                      Greg Bruch            0.001002888020087
33      43984  loren.penkava@enron.com                                         0.000940013378977
33      11222  david.marye@enron.com                                           0.000893487419038
33      17270  tim.johanson@enron.com                                          0.000832301378341
33      18901  dirk.dimitry@enron.com                    Dirk Dimitry          0.000766225906147
33      6221   larimore@enron.com                                              0.000702638068746
33      18923  elizabeth.hutchinson@enron.com            Elizabeth Hutchinson  0.000694783161391
33      17598  olivier.herbelot@enron.com                Olivier Herbelot      0.000646370839464
33      13263  robert.gerry@enron.com                    Robert Gerry          0.000630807518148
33      1999   jaime.gualy@enron.com                     Jaime Gualy           0.000605399939833
33      11254  ami.thakkar@enron.com                                           0.000602358283799
33      11207  danilo.juvane@enron.com                                         0.000589967045480
33      68620  harora@ect.enron.com                      "harpreet"            0.000588929000308
33      41985  alien.cohrs@enron.com                                           0.000586888788662
33      4925   lou.potempa@enron.com                                           0.000583993987068


CATEGORY 34

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 8465  COMPONENTS: 14
LARGEST COMPONENT SIZE: 8420  PERCENT OF TOTAL GRAPH: 99.47%
GROUP DEGREE: 0.07955  GRAPH DENSITY: 0.00059
GROUP CLOSENESS: 0.00071  GROUP BETWEENNESS: 0.15979
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.01

MOST PROBABLE USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
34      8722   home.owner@mailman.enron.com                                    0.000919970216830
34      77707  plucci@enron.com                          Paul Lucci            0.000914890164452
34      8629   undisclosed.recipients@mailman.enron.com                        0.000787836984093
34      33798  us.home.owner@mailman.enron.com                                 0.000555623544131
34      8799   event@mailman.enron.com                                         0.000466134097510
34      70956  home.refinace@mailman.enron.com                                 0.000450228711128
34      29515  postmaster@mailboy.enron.com                                    0.000424305594125
34      8749   user@mailman.enron.com                                          0.000407537199994
34      8779   valued.client@mailman.enron.com                                 0.000391976300601
34      62311  zone34@mailman.enron.com                                        0.000332889620982
34      2347   h..lewis@enron.com                        Andrew H. Lewis       0.000286785135164
34      2453   undisclosed-recipients@enron.com          undisclosed-recipien  0.000281521579392
34      8654   help.3al2_@mailman.enron.com                                    0.000278870249796
34      29948  valuableclients@mailman.enron.com                               0.000269589948415
34      33801  b3@mailman.enron.com                                            0.000261015534416
34      6734   calvin.lee@enron.com                      Calvin Lee            0.000246829661997
34      29920  wwenger@enron.com                                               0.000246085990346
34      30158  ndisclosed.recipients@mailman.enron.com   ndisclosed.Recipient  0.000243912617602
34      37149  union.credit@enron.com                    Credit Union          0.000237150152396
34      8759   valued.home.owner@mailman.enron.com                             0.000232339513153

************************************************************************
CATEGORY 35

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 7568  COMPONENTS: 23
LARGEST COMPONENT SIZE: 7464  PERCENT OF TOTAL GRAPH: 98.63%
GROUP DEGREE: 0.16509  GRAPH DENSITY: 0.00079
GROUP CLOSENESS: 0.00024  GROUP BETWEENNESS: 0.23979
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.01

MOST PROBABLE USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
35      8780   kensey subscriber@mailman.enron.com                             0.001506825392421
35      8594   kkeiser@enron.com                                               0.000891309311095
35      47904  dlassere@enron.com                                              0.000557234183126
35      57370  sshackl@ect.enron.com                     "Sara Shackelton"     0.000510391836097
35      34417  alewis@ect.enron.com                      Andrew Lewis          0.000494223752658
35      41564  sshackl@enron.com                                               0.000474014224776
35      62034  jwillia@enron.com                                               0.000446540030464
35      34375  alewis@enron.com                                                0.000446372437523
35      60089  shendri@ect.enron.com                     scott hendrickson     0.000436527685215
35      41521  brapp@enron.com                                                 0.000375700502252
35      34474  the.daytrader@enron.com                                         0.000351200037885
35      16172  dbaughm@ect.enron.com                                           0.000317865806771
35      21140  david.mayeux@enron.com                                          0.000299770428530
35      29847  hot39d@mailman.enron.com                                        0.000296390397385
35      19928  pallen@enron.com                                                0.000294949309447
35      6355   amosconi@enron.com                                              0.000245879955333
35      56804  ect.security@enron.com                                          0.000229024856121
35      15719  liz.hillman@enron.com                                           0.000221079706631
35      47358  slandwe@enron.com                                               0.000220409623075
35      6360   shwab@enron.com                                                 0.000206628310012


CATEGORY 36

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 6758  COMPONENTS: 15
LARGEST COMPONENT SIZE: 6719  PERCENT OF TOTAL GRAPH: 99.42%
GROUP DEGREE: 0.12011  GRAPH DENSITY: 0.00089
GROUP CLOSENESS: 0.00086  GROUP BETWEENNESS: 0.19975
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.01

MOST PROBABLE USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
36      14664  phil.clifford@enron.com                                         0.001117969429071
36      3053   john.wilson@enron.com                     John Wilson           0.001052278465631
36      14663  bill.briggs@enron.com                                           0.001045070013300
36      30596  sarah.wesner@enron.com                                          0.000732370764522
36      10238  mary.ruffer@enron.com                                           0.000715440982101
36      6842   don.schroeder@enron.com                   Don Schroeder Jr      0.000440779983176
36      41574  jarmogi@enron.com                                               0.000426096256999
36      617    greg.whiting@enron.com                    Greg Whiting          0.000393924165010
36      15183  l..wilson@enron.com                       John L. Wilson        0.000359504521779
36      18654  c.griffin@enron.com                                             0.000255931690006
36      41351  nymex.list@enron.com                                            0.000249573820242
36      15581  brian_hoskins@enron.com                   "Brian Hoskins"       0.000232340671915
36      37136  counsel.dave@enron.com                    Assistant General Co  0.000225210149311
36      20070  pavel.zadorozhny@enron.com                                      0.000216111924819
36      68958  pmg@enron.com                                                   0.000212073342463
36      6142   esop.america@enron.com                                          0.000210246004258
36      41581  bdavis@enron.com                                                0.000201070751783
36      24452  mailing.dl-ga-all_special@enron.com       DL-GA-all_special ma  0.000181896550497
36      6228   evite@enron.com                                                 0.000181855283811
36      84400  mike.harper.enronxgate@enron.com                                0.000175735047509


CATEGORY 37

EXPLICIT SOCIAL NETWORK STATISTICS

VERTICES: 6063  COMPONENTS: 29
LARGEST COMPONENT SIZE: 5991  PERCENT OF TOTAL GRAPH: 98.81%
GROUP DEGREE: 0.13825  GRAPH DENSITY: 0.00082
GROUP CLOSENESS: 0.00041  GROUP BETWEENNESS: 0.26972
AVERAGE p(z|u): 0.02  STDEV p(z|u): 0.01

MOST PROBABLE USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
37      61754  jschwie@ect.enron.com                     JSCHWIE               0.000696842147218
37      71959  aa@mailman.enron.com                                            0.000693156788563
37      34651  philbil@ect.enron.com                                           0.000620391215386
37      15381  ebass@ect.enron.com                                             0.000583114156419
37      62384  mwoodson@enron.com                                              0.000487794592246
37      24019  backroads.travel.update@mailman.enron.co                        0.000472587110233
37      44859  mwhitt@enron.com                                                0.000467571333293
37      70877  list.subscriber@mailman.enron.com                               0.000411887926942
37      70827  start.the.new.year.with.a.clean.slate@ma                        0.000388682712186
37      70836  schlenker@mailman.enron.com                                     0.000366137384702
37      70797  barr@mailman.enron.com                                          0.000350957031855
37      70959  valuable.customer@mailman.enron.com                             0.000316567669689
37      22613  tdonoho@enron.com                                               0.000300560372321
37      70910  subscriber@mailman.enron.com                                    0.000258586675413
37      34866  tkuyken@enron.com                         tkuyken               0.000237788283081
37      29915  carter@mailman.enron.com                                        0.000227801729953
37      87104  hi@mailman.enron.com                                            0.000222889717125
37      8490   info@enron.com                            Enron Consumer Affai  0.000216785852683
37      81740  plucci@ect.enron.com                                            0.000212485211463
37      70929  marrow@mailman.enron.com                                        0.000202602422447


CATEGORY  38 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  9483  COMPONENTS:  25 

LARGEST  COMPONENT  SIZE:  9382  PERCENT  OF  TOTAL  GRAPH:  98.93%

GROUP  DEGREE:  0.14278  GRAPH  DENSITY:  0.00063 

GROUP  CLOSENESS:  0.00022  GROUP  BETWEENNESS:  0.23984 

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.03

MOST  PROBABLE  USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
38      22643  pmims@enron.com                                                 0.001214757260154
38      22403  dfarmer@ect.enron.com                                           0.001190479703897
38      4493   lcampbel@enron.com                                              0.001150836312456
38      78538  kholst@ect.enron.com                                            0.001076378797115
38      79414  lblair@enron.com                                                0.001024856235163
38      72458  jtholt@ect.enron.com                                            0.000992002873818
38      85541  tgeacco@enron.com                                               0.000985890599981
38      36000  emclaug@enron.com                                               0.000958813419644
38      68835  jarnold@ect.enron.com                                           0.000935988817177
38      81740  plucci@ect.enron.com                                            0.000921286502272
38      60973  kruscit@ect.enron.com                     Kevin                 0.000910087386993
38      74278  kward@ect.enron.com                       "Kim Ward"            0.000904572493622
38      77844  jreitme@enron.com                                               0.000874476844228
38      78508  kholst@enron.com                                                0.000868779182142
38      71045  mcuilla@ect.enron.com                     Martin Cuilla         0.000859396701212
38      36250  pmims@ect.enron.com                                             0.000844357143993
38      35021  plove@ect.enron.com                                             0.000837649877088
38      71324  tkuyken@ect.enron.com                                           0.000837299636707
38      34866  tkuyken@enron.com                         tkuyken               0.000831368009449
38      4310   ebass@enron.com                                                 0.000830791850316


***************************************************************************************************************** 
CATEGORY  39 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  7900  COMPONENTS:  23 

LARGEST  COMPONENT  SIZE:  7839  PERCENT  OF  TOTAL  GRAPH:  99.23%
GROUP  DEGREE:  0.18034  GRAPH  DENSITY:  0.00063 

GROUP  CLOSENESS:  0.00045  GROUP  BETWEENNESS:  0.29982 

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.01

MOST  PROBABLE  USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
39      41733  dperlin@enron.com                         "dperlin@enron.com"   0.000903739067878
39      22614  pkeavey@enron.com                                               0.000789044685087
39      81838  tstaab@enron.com                          Theresa Staab         0.000583248027857
39      48549  mike.hernandez@enron.com                                        0.000418949006629
39      26360  dgiron@enron.com                                                0.000398157756866
39      82421  jmoore@enron.com                                                0.000334568870084
39      69405  khyatt@enron.com                                                0.000295634334730
39      53412  siegel.avram@enron.com                                          0.000270611524906
39      53015  siegel'.'avram@enron.com                                        0.000268987115314
39      36409  newport-news.com@mailman.enron.com                              0.000268667343561
39      24830  eileen.buerkert@enron.com                                       0.000268428184913
39      17165  clickathome.mailout@enron.com                                   0.000260881875838
39      33916  todd.bowen@enron.com                      Todd Bowen            0.000256691757385
39      31719  erisk@enron.com                                                 0.000255897991206
39      68037  gary.kane@enron.com                                             0.000247546227294
39      12151  diana.peters@enron.com                    Diana Peters          0.000225040004984
39      37735  kpresto@ect.enron.com                                           0.000221535967321
39      55204  home.dvd@mailman.enron.com                                      0.000214326914687
39      63803  tamayo@enron.com                                                0.000214326914687
39      24038  robert.pechar@enron.com                   Robert Pechar         0.000195555424072


CATEGORY  40 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  3198  COMPONENTS:  13 

LARGEST  COMPONENT  SIZE:  3151  PERCENT  OF  TOTAL  GRAPH:  98.53% 
GROUP  DEGREE:  0.15501  GRAPH  DENSITY:  0.00094 

GROUP  CLOSENESS:  0.00091  GROUP  BETWEENNESS:  0.30925 
AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.02


MOST  PROBABLE  USERS 


Topic#  ID#    Email Address
40      37217  c_r_zander@enron.com
40      37114  wollam'.'erik@enron.com
40      37115  chet.fenner@enron.com
40      37146  fenner.chet@enron.com
40      37122  wollam.erik@enron.com
40      37220  erwollam@enron.com
40      29102  chambers.john@enron.com
40      37151  knipe'.'chad@enron.com
40      37157  constantine'.'brian@enron.com
40      37152  corrier.brad@enron.com
40      37218  feder'.'t@enron.com
40      37216  feder.t@enron.com
40      37147  knipe.chad@enron.com
40      37223  chet_fenner@enron.com
40      37159  mccomb.keith@enron.com
40      61559  jarrod.haughn@enron.com
40      703    matthew.lenhart@enron.com
40      37161  mccomb.chris@enron.com
40      14855  luis.mena@enron.com
40      14863  purvi.patel@enron.com


Name 


Chet  Fenner 


John  Chambers 
chad  knipe . . . 


Brad  Corrier 


chad  knipe 


Keith  McComb 


Matthew  Lenhart 
Chris  McComb. . . 


p(z|u) 

0.001362062064994 

0.001352652606597 

0.001333974284042 

0.001324966830009 

0.001286542806865 

0.001217350755984 

0.001192661493249 

0.001191418349095 

0.001141971905505 

0.001123786642376 

0.001098026456433 

0.001086028832547 

0.001016546916833 

0.001015054490788 

0.000995266434670 

0.000858525615543 

0.000802244258104 

0.000778106690150 

0.000636448337936 

0.000619514335550 


CATEGORY  41 

EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  3755  COMPONENTS:  17 

LARGEST  COMPONENT  SIZE:  3671  PERCENT  OF  TOTAL  GRAPH:  97.76% 
GROUP  DEGREE:  0.05087  GRAPH  DENSITY:  0.00107 

GROUP  CLOSENESS:  0.00045  GROUP  BETWEENNESS:  0.11925 
AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.01

MOST  PROBABLE  USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
41      73730  mariachi.el@enron.com                     El Mariachi           0.001367308136551
41      15748  plove@enron.com                                                 0.000762304310293
41      15729  gary.w.lamphier@enron.com                                       0.000431011007515
41      71023  wkaseme@enron.com                                               0.000396938966542
41      16893  carter.ellis@enron.com                    Carter Ellis          0.000358006784754
41      2561   reagan.mathews@enron.com                  Reagan Mathews        0.000348886178625
41      1140   joe.stepenovitch@enron.com                Joe Stepenovitch      0.000303042576406
41      1129   andy.pace@enron.com                       Andy Pace             0.000249534277156
41      1952   valerie.ramsower@enron.com                Valerie Ramsower      0.000245887393921
41      22519  mike.morris@enron.com                                           0.000229771976309
41      15722  william.kasemervisz@enron.com                                   0.000214208468158
41      1881   greg.martin@enron.com                     Greg Martin           0.000212687621422
41      11111  michael.morris@enron.com                  Michael Morris        0.000204999101831
41      52720  1.'andre@enron.com                        e-mail                0.000204009703757
41      61577  dl-erc@enron.com                          DL-ERC                0.000191261744055
41      38006  servello.anthony@enron.com                                      0.000188452133106
41      15516  o'neal.winfree@enron.com                                        0.000185441179200
41      53731  shannon.ed@enron.com                      Ed Shannon            0.000184879075621
41      52729  kenny'.'denis@enron.com                                         0.000176946199448
41      2515   martin.cuilla@enron.com                   Martin Cuilla         0.000176037943189


CATEGORY  42 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  7411  COMPONENTS:  11 

LARGEST  COMPONENT  SIZE:  7389  PERCENT  OF  TOTAL  GRAPH:  99.70%
GROUP  DEGREE:  0.15184  GRAPH  DENSITY:  0.00081 

GROUP  CLOSENESS:  0.00263  GROUP  BETWEENNESS:  0.30977 

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.02

MOST  PROBABLE  USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
42      24810  rita.bahner@enron.com                                           0.000805608717791
42      9021   richard.babin@enron.com                                         0.000797487326104
42      2990   warrick.franklin@enron.com                Warrick Franklin      0.000732665859683
42      16301  .marc@enron.com                           e-mail                0.000732426166615
42      15530  .mom@enron.com                            e-mail                0.000719260565466
42      8838   r..lilly@enron.com                        Kyle R. Lilly         0.000713713041196
42      81467  .genia@enron.com                          e-mail                0.000713374066709
42      3788   .erik@enron.com                           e-mail                0.000707024570697
42      12449  dave.lawlor@enron.com                                           0.000670644318918
42      9044   susan.bulgawicz@enron.com                                       0.000650229323052
42      12447  jim.roth@enron.com                        Jim Roth              0.000650060151848
42      51739  antoine.duvauchelle@enron.com                                   0.000643280571798
42      16082  thiem'.'matt@enron.com                    Matt Thiem            0.000629237284975
42      61565  'hotmail.com@enron.com                                          0.000627229037215
42      12448  dan.botsch@enron.com                                            0.000615150204166
42      16087  mudd'.'lisa@enron.com                     Lisa Mudd             0.000610940473321
42      37659  'tconl.com@enron.com                                            0.000590838897926
42      19277  benjamin.freeman@enron.com                                      0.000567691789362
42      85338  .mordente@enron.com                       e-mail                0.000561229286806
42      41104  allison.healy-poe@enron.com               Allison Healy-Poe     0.000544563372251


*****************************************************************************************************************
CATEGORY  43


EXPLICIT  SOCIAL  NETWORK  STATISTICS

VERTICES:  6715  COMPONENTS:  17

LARGEST  COMPONENT  SIZE:  6655  PERCENT  OF  TOTAL  GRAPH:  99.11%
GROUP  DEGREE:  0.09178  GRAPH  DENSITY:  0.00089

GROUP  CLOSENESS:  0.00048  GROUP  BETWEENNESS:  0.14972

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.01

MOST  PROBABLE  USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
43      44150  jim.peterson@enron.com                                          0.000249575678766
43      2453   undisclosed-recipients@enron.com          undisclosed-recipien  0.000225172162911
43      76756  anngtc.dl-ets@enron.com                   DL-ETS ANNGTC         0.000181271087772
43      55936  steven.p.south@enron.com                  Steve South           0.000180816666327
43      85     ebiz@enron.com                            eBiz                  0.000167467120787
43      62868  bruce.bowden@enron.com                                          0.000165167437456
43      78819  richard.gilliland@enron.com               Richard Gilliland     0.000160509581501
43      35833  colin.poon@enron.com                      Colin Poon Tip        0.000156107235208
43      41550  anne.c.koehler@enron.com                                        0.000139679454219
43      55710  itrezzo.agent@enron.com                   Itrezzo Agent         0.000136195817496
43      81356  sheryl'.'gussett@enron.com                                      0.000135645126650
43      6036   maritta.mullet@enron.com                                        0.000131422272010
43      20326  katrina.chapman@enron.com                                       0.000129431166380
43      41075  daemon.extra@enron.com                    EXTRA Mailer Daemon   0.000124903059073
43      19898  corey.hollander@enron.com                                       0.000124859520390
43      26463  boudreaux.john@enron.com                  John Boudreaux        0.000123440754475
43      26414  newsome.linda@enron.com                                         0.000120691890243
43      799    dl-ga-all_domestic@enron.com              DL-GA-all_domestic    0.000119694054478
43      5242   jay.patel@enron.com                                             0.000119494829467
43      11176  milagros.daetz@enron.com                                        0.000113096329756


CATEGORY  44 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  4757  COMPONENTS:  6 

LARGEST  COMPONENT  SIZE:  4745  PERCENT  OF  TOTAL  GRAPH:  99.75%
GROUP  DEGREE:  0.27845  GRAPH  DENSITY:  0.00105 

GROUP  CLOSENESS:  0.00647  GROUP  BETWEENNESS:  0.45969 

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.03

MOST  PROBABLE  USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
44      6031   outlook.team@enron.com                                          0.001457639238429
44      4987   billy.dorsey@enron.com                                          0.001448022707578
44      1285   suzanne.danz@enron.com                    Danz                  0.001390803746101
44      38627  mcmahon@enron.com                                               0.001075689070940
44      215    stacey.white@enron.com                                          0.001035505396680
44      4116   katherine.brown@enron.com                                       0.001001321457678
44      12042  georgene.moore@enron.com                  Georgene Moore        0.000704291034546
44      17437  elaine.overturf@enron.com                                       0.000633403923622
44      7112   zionette.vincent@enron.com                Zionette Vincent      0.000605428407510
44      4963   hilda.bourgeois-galloway@enron.com                              0.000595142103078
44      5484   vanessa.groscrand@enron.com                                     0.000571125577302
44      6097   simone.rose@enron.com                                           0.000559112815457
44      10552  dan.bruce@enron.com                                             0.000524759469048
44      34175  kathryn.thomas@enron.com                  Kathryn Thomas        0.000514232504702
44      1244   barbara.hooks@enron.com                   Barbara Hooks         0.000506891760699
44      4063   sherri.reinartz@enron.com                                       0.000504281954880
44      57673  legal.7@enron.com                                               0.000466587322574
44      5060   jana.paxton@enron.com                                           0.000466130256002
44      18048  rick.carson@enron.com                                           0.000450157316280
44      35725  daniel.lyons@enron.com                    Daniel Lyons          0.000423396381388


CATEGORY  45 

EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  3351  COMPONENTS:  6 

LARGEST  COMPONENT  SIZE:  3340  PERCENT  OF  TOTAL  GRAPH:  99.67%
GROUP  DEGREE:  0.06876  GRAPH  DENSITY:  0.00090 

GROUP  CLOSENESS:  0.00745  GROUP  BETWEENNESS:  0.14918 

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.02

MOST  PROBABLE  USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
45      1181   exchange.administrator@enron.com                                0.001517057163547
45      3430   system.administrator@enron.com                                  0.001380664474799
45      642    eric.bass@enron.com                       Eric Bass             0.000836370999018
45      29154  postmaster@enron.com                                            0.000829706720554
45      8776   thomas.underwood@enron.com                Thomas Underwood      0.000782691820933
45      81667  brewer@enron.com                                                0.000670754388186
45      66671  moscoso@enron.com                                               0.000593532000242
45      1383   nick.hiemstra@enron.com                   Nick Hiemstra         0.000566085668242
45      65104  meredith@enron.com                        "                     0.000454517520423
45      7122   rafael.avila@enron.com                    Rafael Avila          0.000414596161193
45      11353  mark.morrow@enron.com                                           0.000409627634599
45      19276  nicholas.stephan@enron.com                                      0.000362676763839
45      46356  aaron.klemm@enron.com                                           0.000359370679031
45      5727   rob.gay@enron.com                                               0.000351821963634
45      16538  .russell@enron.com                        e-mail                0.000348453239846
45      2614   christa.winfrey@enron.com                 Christa Winfrey       0.000333860854958
45      1878   bryan.hull@enron.com                      Bryan Hull            0.000330947117717
45      24005  suzanne.russell@enron.com                 Suzanne Russell       0.000330474580949
45      501    misti.day@enron.com                       Misti Day             0.000321193293553
45      9128   randall.gay@enron.com                                           0.000316053128149


CATEGORY  46 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  2525  COMPONENTS:  18 

LARGEST  COMPONENT  SIZE:  2483  PERCENT  OF  TOTAL  GRAPH:  98.34%
GROUP  DEGREE:  0.13137  GRAPH  DENSITY:  0.00238 

GROUP  CLOSENESS:  0.00112  GROUP  BETWEENNESS:  0.22905 

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.04

MOST  PROBABLE  USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
46      22503  carlos.j.rodriguez@enron.com              carlos.j.rodriguez    0.001361925244531
46      22502  gary.a.hanks@enron.com                    gary.a.hanks          0.001358652912680
46      14685  kate.fraser@enron.com                     Kate Fraser           0.001250193273273
46      6068   joe.casas@enron.com                                             0.001169417430292
46      14755  cindy.vachuska@enron.com                  Cindy Vachuska        0.001142188670334
46      18724  kevin.alvarado@enron.com                  Kevin Alvarado        0.001130869990743
46      63604  lhs-gas.kvammen@enron.com                 Kjell - LHS-GAS Kvam  0.001123994126733
46      6070   scott.loving@enron.com                                          0.001108816235622
46      6578   briant.baker@enron.com                    Briant Baker          0.001088074634254
46      15567  earl.tisdale@enron.com                                          0.001017305226123
46      20267  sabra.dinari@enron.com                                          0.001016906056959
46      22831  stone.charlie@enron.com                                         0.001015577070221
46      63612  lhc-gas.kvammen@enron.com                 Kjell - LHC-GAS Kvam  0.001000535662754
46      1422   jessie.patterson@enron.com                Jessie Patterson      0.000863189169090
46      11081  rebecca.griffin@enron.com                 Rebecca Griffin       0.000848303716935
46      6067   alvin.thompson@enron.com                                        0.000846458822023
46      2999   margie.straight@enron.com                 Margie Straight       0.000842621902831
46      22832  avila.david@enron.com                                           0.000786294011057
46      20241  susan.hadix@enron.com                                           0.000767909391233
46      63316  hfs.reite@enron.com                       NILS - B. Superinten  0.000754027412323


***************************************************************************************************************** 
CATEGORY  47 


EXPLICIT  SOCIAL  NETWORK  STATISTICS 

VERTICES:  5813  COMPONENTS:  14 

LARGEST  COMPONENT  SIZE:  5772  PERCENT  OF  TOTAL  GRAPH:  99.29%
GROUP  DEGREE:  0.11283  GRAPH  DENSITY:  0.00069 

GROUP  CLOSENESS:  0.00087  GROUP  BETWEENNESS:  0.22968 

AVERAGE  p(z|u):  0.02  STDEV  p(z|u):  0.01

MOST  PROBABLE  USERS

Topic#  ID#    Email Address                             Name                  p(z|u)
47      2629   bbutler2@enron.com                                              0.001256182039730
47      70     mark.brand@enron.com                      Mark Brand            0.001010339147226
47      1115   clint.dean@enron.com                      Clint Dean            0.000847763531732
47      44193  1.foust@enron.com                                               0.000531142021457
47      11221  aaron.martinsen@enron.com                                       0.000530399196322
47      53620  culbertson.david@enron.com                David Culbertson      0.000505657927192
47      26363  giron.kristi@enron.com                                          0.000499362665570
47      37391  .judy@enron.com                           e-mail                0.000447030997618
47      7076   zachary.mccarroll@enron.com               Zachary McCarroll     0.000416268350065
47      1881   greg.martin@enron.com                     Greg Martin           0.000408819462068
47      11289  sarah.goodpastor@enron.com                                      0.000393161899312
47      38011  wienserski.dan@enron.com                                        0.000380934354936
47      83584  adam.giannone@enron.com                   Adam Giannone         0.000368199021783
47      1129   andy.pace@enron.com                       Andy Pace             0.000354615277351
47      7614   michael.simmons@enron.com                 Michael Simmons       0.000353895939361
47      3426   mswerzb@ect.enron.com                                           0.000344006437805
47      41170  enron.above@enron.com                                           0.000328388216605
47      2389   robert.vargas@enron.com                   Robert Vargas         0.000320739204939
47      53621  duke.kyle@enron.com                       Kyle Duke             0.000320082224495
47      38027  'kevin'@enron.com                         Kevin                 0.000317172268516